ABSTRACT: Perceptual studies suggest that the visual system uses the "rigidity"
assumption to recover three-dimensional structure from motion. Ullman (1984) recently
proposed a computational scheme, the incremental rigidity scheme, which uses the
rigidity assumption to recover the structure of rigid and nonrigid objects in motion. The
scheme assumes the input to be discrete positions of elements in motion, under
orthographic projection. We present formulations of Ullman's method that use velocity
information and perspective projection in the recovery of structure. Theoretical and computer
analyses show that the velocity-based formulations provide a rough estimate of structure
quickly, but are not robust over an extended time period. The stable long-term recovery
of structure requires disparate views of moving objects. Our analysis raises interesting
questions regarding the recovery of structure from motion in the human visual system.

1. Introduction


An important source of three-dimensional information is provided by the relative
motions of elements in the changing two-dimensional image. The human visual system
is capable of recovering structure from motion under both orthographic and perspective
projection, and in the absence of all other cues to 3-D structure (see, for example,
Miles, 1931; Wallach and O'Connell, 1953; Braunstein, 1976; Johansson, 1978; Ullman,
1979). In studying the computation of structure from motion, one immediately faces
the problem that the recovery of structure is underconstrained; there are infinitely many
3-D structures consistent with a given pattern of motion in the changing 2-D image.
Additional constraint is required to establish a unique interpretation.


Early perceptual studies suggested that the rigidity of objects may play a key role in
the recovery of structure from motion (Wallach and O'Connell, 1953; Gibson and Gibson,
1957; Green, 1961; Johansson, 1975, 1977). Computational studies later established that
rigidity is a sufficiently powerful constraint to derive a unique interpretation of structure,
under a variety of viewing conditions. For example, Ullman and Fremlin (Ullman, 1979)
showed that under orthographic projection, three views of four non-coplanar points are
sufficient to guarantee a unique 3-D interpretation (up to an unavoidable reflection about
the image plane). In the case of perspective projection, Longuet-Higgins and Prazdny
(1981) proved that the instantaneous velocity field and its first and second spatial
derivatives at a point admit at most three different 3-D interpretations. Tsai and Huang (1981)
showed that, with the exception of a few special configurations, two perspective views
of seven points in motion are sufficient to guarantee a unique 3-D interpretation.
Waxman and Ullman (1981) also addressed the uniqueness of the recovery of structure under
perspective projection, basing their results on a kinematic analysis of continuous image
flows. Additional theoretical results have been obtained for various classes of restricted
motion, such as planar surfaces in motion (Hay, 1966; Longuet-Higgins, 1984; Waxman
and Ullman, 1985; Ullman, 1985; Negahdaripour and Horn, 1985), pure translatory
motion (Clocksin, 1980), planar or fixed-axis rotation (Hoffman and Flinchbaugh, 1982;
Webb and Aggarwal, 1981; Bobick, 1983; Bennett and Hoffman, 1984a,b; Sugie and
Inagaki, 1981), and translation perpendicular to the rotation axis (Longuet-Higgins, 1983).
A review of the theoretical results regarding the recovery of structure from motion can
be found in Ullman (1983).


From theoretical studies of the structure-from-motion problem, it can be concluded
that by exploiting a rigidity constraint, 3-D structure can be recovered from motion
alone, using image information that is integrated over a small extent in space and in
time. These theoretical studies have also given rise to algorithms for deriving the rigid
3-D structure of moving objects (for example, Ullman, 1979; Longuet-Higgins, 1981; Tsai
and Huang, 1981). Experimentation with these algorithms has revealed two important
limitations. First, although it is possible in theory to recover structure from motion
information that is integrated over a small extent in space and time, such a strategy
may not be robust in practice. A small amount of error in the image measurements
can lead to very different solutions (Ullman, 1983). Second, most previous algorithms
derive a three-dimensional structure only when a rigid interpretation is possible, and
otherwise do not yield any interpretation of structure or yield a solution that is incorrect
or unstable.

The first observation above suggests that a robust algorithm for recovering structure
should use motion information that is more extended in space or time. This conclusion
is supported in recent computational studies by Negahdaripour and Horn (1985) and
Ullman (1984). Negahdaripour and Horn addressed the recovery of the motion of an
observer relative to a stationary planar surface. It was shown that a robust recovery
of both the observer motion and the orientation of the plane is possible when dense
measurements of the spatial and temporal derivatives of image brightness are integrated
over a large region of the changing image. Thus, consideration of motion information
that is more extended in space can lead to a stable recovery of structure. Bruss and
Horn (1983) also proposed an algorithm for recovering the motion of an observer relative
to a stationary scene, which integrates motion information over an extended region of
the image. The study by Ullman (1984), which will be developed further in this paper,
demonstrated that a robust recovery of structure is also possible when motion information
is integrated over an extended period of time. The extension in time can be achieved,
for example, by considering a large number of discrete frames or by observing continuous
motion over a significant temporal extent.

With regard to the human visual system, the dependence of perceived structure on
the spatial and temporal extent of the viewed motion has not yet been studied
systematically, but the following informal observations have been made. Regarding spatial extent,
two or three points undergoing relative motion are sufficient to elicit a perception of 3-D
structure (Börjesson and von Hofsten, 1973; Johansson, 1975), although theoretically the
recovery of structure is less constrained for two points in motion, and perceptually the
sensation of structure is weaker. An increase in the number of moving elements in view
appears to have little effect on the quality of perceived structure (for example, Petersik,
1980). Regarding the temporal extent of viewed motion, Johansson (1975) showed that
a brief observation of patterns of moving lights generated by human figures moving in
the dark (commonly referred to as biological motion displays) can lead to a perception of
the 3-D motion and structure of the figures. Other perceptual studies indicate that the
human visual system requires an extended time period to reach an accurate perception of
3-D structure (Wallach and O'Connell, 1953; White and Mueser, 1960; Green, 1961). A
brief observation of a moving pattern sometimes yields an impression of structure that is
"flatter" than the true structure of the moving object (Ullman, 1984). Thus, the human
visual system is capable of deriving some sense of structure from motion information that
is integrated over a small extent in space and time. An accurate perception of structure
may, however, require a more extended viewing period.

It was noted earlier that most algorithms for recovering structure from motion are
unable to interpret nonrigid motions. There are, however, some exceptions to this.
Bennett and Hoffman (1984b) studied the minimum amount of motion information required
to derive a unique interpretation of the structure of a set of discrete elements undergoing
nonrigid motion, when it is assumed that the elements are rotating about a fixed axis
in space. Hoffman and Flinchbaugh (1982) proposed an algorithm for interpreting the
3-D motion and structure in biological motion displays. This algorithm decomposes the
overall nonrigid motion into pairs of points that are rigidly linked and rotating in a plane,
and triplets of points forming two hinged links that rotate in the same plane. Koenderink
and Van Doorn (1981; Koenderink, 1984) examined the class of bending deformations,
which satisfy the physical constraint that distances along the surface of the object are
preserved by the transformation. This class of deformations excludes any stretching or
compressing of the object surface. In its current formulation, the method proposed by
Koenderink and Van Doorn for recovering the structure of bending surfaces requires that
the surfaces be complete, in contrast with other algorithms that are able to interpret the
structure of isolated points in motion. To conclude, the algorithms discussed above for
recovering the structure of nonrigid objects in motion all address restricted classes of
these motions, such as fixed-axis motion, planar motion, and bending deformations.

The mechanism for recovering structure from motion in the human visual system
appears not to be based strictly on the rigidity assumption. It is an everyday experience
to perceive the structure and motion of deforming objects such as a flowing river, an
expanding balloon, or a dancing ballerina. Perceptual studies reveal that the human
visual system can derive some sense of structure for a broad range of nonrigid motions,
including stretching, bending and even more complex types of deformations (Johansson,
1964, 1978; Jansson and Johansson, 1973; Todd, 1982, 1984). It is also the case that
displays of rigid objects in motion sometimes give rise to the perception of somewhat
distorting objects (Wallach, Weisz and Adams, 1956; White and Mueser, 1960; Green,
1961; Braunstein, 1962; Sperling et al., 1983; Hildreth, 1984a; Adelson, 1985).

In this paper, we focus on the recent work of Ullman (1984), which provides a
more flexible method for deriving the structure of rigid and nonrigid objects in motion,
and provides a natural means for integrating motion information over an extended time
period. This method makes use of the rigidity assumption, but in a different way from
previous studies. The algorithm, called the incremental rigidity scheme, maintains an
internal model of the structure of a moving object, which is continually updated as new
positions of image elements are considered. The initial model may be flat, if no other cues
to 3-D structure are present, or it may be determined by other cues available, for example,
from binocular stereo, shading, texture, and perspective (Marr, 1982; Ballard and Brown,
1982; Horn, 1985). As each new view of the moving object appears, the algorithm
computes a new set of 3-D coordinates for points on the object, which maximizes the
rigidity in the transformation from the current model to the new positions. In particular,
the algorithm minimizes the change in the 3-D distances between points in the model.
The formulation presented by Ullman assumes the input to the recovery process to consist
of a sequence of discrete frames, each containing a set of discrete feature points. Through
the process of repeatedly considering a new frame in the sequence and updating the
current model of the structure of the moving features, the incremental rigidity scheme
builds up and maintains a 3-D model, and can be applied to both rigid and nonrigid
objects in motion. Further details of the incremental rigidity scheme are presented in
section 2 and in Appendix A.

The incremental rigidity scheme has a number of advantages from a computational
perspective (Ullman, 1984): (1) because it integrates information over an extended time
period, it provides a stable recovery of structure, particularly in the presence of error in
the image measurements, (2) it allows deviations from rigidity, while always maintaining
some model of the 3-D structure of the object, (3) it provides a natural means for
interactions with other sources of 3-D information, and (4) empirical studies suggest
that the algorithm is able to recover the correct 3-D structure. The behavior of this
algorithm is also consistent in several ways with human perceptual behavior (Ullman,
1984).


In this paper, we develop Ullman's work further, in several directions. First, in
section 2, we present a continuous formulation of the incremental rigidity scheme, which
uses velocity information at discrete points as input to the recovery process. In section
3, we then examine in more detail the behavior of the incremental rigidity scheme when
presented with rigid objects undergoing rotation about a single axis in space, under
orthographic projection. In particular, through computer simulations and a theoretical
analysis, we examine the behavior of the discrete formulation as a function of the angular
displacement between frames, and compare this behavior with that of the continuous
formulation. Finally, in section 4, we present both discrete and continuous formulations
of the incremental rigidity scheme that use perspective projection. Through computer
simulations, we begin to examine the behavior of the perspective formulations, when
presented with rigid objects undergoing both pure rotation about a single axis, and pure
translation through space.


The main conclusions of the paper are the following. The direct use of velocity
information as input to the incremental rigidity scheme can provide a rough estimate of
the structure of a moving object over a short viewing period, but is not sufficiently
powerful to allow a detailed and robust recovery of structure over an extended time period.
The computation of a stable long-term solution appears to require the use of views of
a moving object that differ significantly. This implies the need for a recovery process
with memory of the past views, but this memory need not be extended indefinitely and
continuously into the past. A small number of discrete views of a moving object are
sufficient for recovering 3-D structure. In the case of the incremental rigidity scheme, the
use at every instant of a current model of the 3-D structure of the object, and a present
view that is sufficiently different from preceding views, can provide a robust recovery of
structure. In the case of rotation of a rigid object about a single axis in space, both the
rate of convergence of the algorithm to the final solution and the quality of the solution
decrease as smaller angular displacements between viewed frames are considered. In the
limit of the continuous formulation, the solution is no longer stable. The behavior of the
perspective formulation of the incremental rigidity scheme is more complex than that
of the orthographic formulation. We found that, if the absolute position of an object
in space is known throughout the motion of the object, then the perspective
formulation performs well, similar to the orthographic formulation. The results of computer
simulations revealed a degradation in performance with smaller angular and spatial
displacements between frames, but this degradation was somewhat more severe than in the
orthographic formulation. Again, in the limit of the continuous formulation, the solution
is no longer stable. Our analysis raises important questions regarding the quantitative
ability with which the human visual system can recover structure from motion; these
questions are discussed in section 5.


2. Discrete and Continuous Formulations of the Incremental Rigidity Scheme

In this section, we first describe Ullman's (1984) original formulation of the incremental
rigidity scheme, which assumes the visual input to consist of a sequence of frames, each
containing a number of discrete points that may correspond to identifiable features in the
changing image. We then present a formulation that uses velocity information at discrete
points in a continuously changing image as input to the recovery process. The analysis
in this section assumes orthographic projection of the scene onto the image plane.

The motivations for considering a continuous formulation are threefold. First, on the
basis of the results of computer simulations, Ullman (1984) noted that when analyzing
objects undergoing rigid rotation, the convergence of the incremental rigidity scheme to
the correct solution was slower when smaller angular separations between frames were
used. This suggests that the scheme may perform better when successive views of the
object differ significantly. We considered the limit of arbitrarily close frames, both as
a means of studying this phenomenon, and to determine whether a robust recovery of
structure is still possible under these conditions. A second motivation is that recent
work on the computation of an instantaneous 2-D velocity field from the changing
image suggests that a unique velocity field can be obtained for general classes of motion,
exploiting a constraint on the smoothness of the velocity field (Horn and Schunck, 1981;
Hildreth, 1984a,b; Nagel, 1984). Ultimately, it may be useful to integrate the results of
such velocity field computations with the recovery of structure from motion. A third
motivation is that Ullman's formulation of the incremental rigidity scheme leads to the
solution of a set of nonlinear equations. It is shown in Appendix A that the continuous
formulation presented here leads to the solution of a set of linear equations. This makes
a theoretical analysis of the solution more accessible, and could in principle result in a
more efficient computer implementation.

2.1 Ullman's Discrete Formulation

The incremental rigidity scheme maintains and updates an internal model $M(t)$ of the
viewed object, which consists of a set of 3-D coordinates: $M(t) = (x_i(t), y_i(t), z_i(t))$.
In this section, we assume orthographic projection (the case of perspective projection is
addressed in section 4) onto the X-Y image plane, so that $(x_i(t), y_i(t))$ are the image
coordinates of the i-th point, and $z_i(t)$ is the current estimate of the depth at the i-th
point. (We assume a left-handed coordinate system, with the positive Z axis pointing
away from the observer.) In Ullman's formulation, when no other 3-D cues are present,
the initial model $M(t)$ at $t = 0$ is taken to be flat; that is, $z_i(0) = 0$ (or some other
constant value) for $i = 1, \ldots, n$, where $n$ is the number of points in motion. In principle,
other initial configurations could also be considered. The theoretical analysis of section 3
examines the long-term stability of the incremental rigidity scheme, independent of the
initial model of the structure of the moving points.

Given a current model $M(t)$ at time $t$, and the image of the moving points in a
new frame at a later time $t'$, the problem is to compute a new model $M(t')$ such that
the transformation from $M(t)$ to $M(t')$ is as rigid as possible. Since $x_i(t')$ and $y_i(t')$ are
known, this requires the computation of the unknown depth values $z_i(t')$. (It is assumed
that the correspondence between points in the two successive frames is known.) The new
depth values are computed as follows. Let $l_{ij}(t)$ denote the distance between points $i$
and $j$ at time $t$. To make the transformation as rigid as possible, the values $z_i(t')$ for
the new model are chosen so as to make $l_{ij}(t)$ and $l_{ij}(t')$ as similar as possible. For this
purpose, Ullman defined a measure of the difference between $l_{ij}(t)$ and $l_{ij}(t')$ as:


$$d(l_{ij}(t), l_{ij}(t')) = \frac{(l_{ij}(t) - l_{ij}(t'))^2}{l_{ij}^3(t)} \tag{2.1}$$


and formulated the recovery of structure as the computation of $z_i(t')$ that minimize the
following overall deviation from rigidity:

$$D(t, t') = \sum_{i,j} d(l_{ij}(t), l_{ij}(t')) \tag{2.2}$$

After the values $z_i(t')$ have been determined using this minimization process, the new
model $M(t') = (x_i(t'), y_i(t'), z_i(t'))$ becomes the current model. A new frame is then
registered and the process repeats itself. In this way, the scheme maintains rigidity by
keeping the total distances between points in the model as constant as possible. The
motivation for the cubic factor in the denominator of Eq. (2.1) is that the nearest
neighbors to a given point are more likely to belong to the same object than distant
neighbors, so that a point is more likely to move rigidly with its nearest neighbors. The
$l_{ij}^3(t)$ factor diminishes the influence of distant points in the recovery process.
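
The update step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`rotate_about_vertical`, `deviation`, `update_model`) are hypothetical, and the coordinate pattern search used to lower the deviation is a stand-in chosen for brevity, not the solver of Appendix A. The model is assumed to contain no two coincident points, so that every $l_{ij}(t)$ is nonzero.

```python
import math
from itertools import combinations


def rotate_about_vertical(point, theta):
    """Rotate a 3-D point by theta radians about the vertical (Y) axis."""
    x, y, z = point
    return (x * math.cos(theta) + z * math.sin(theta),
            y,
            -x * math.sin(theta) + z * math.cos(theta))


def deviation(model, new_xy, new_z, cubic_weight=True):
    """Eq. (2.2): sum over point pairs of (l_old - l_new)^2 / l_old^3.

    model  : list of (x, y, z) for the current model M(t)
    new_xy : list of (x, y) image positions in the new frame
    new_z  : candidate depths for the new frame
    """
    new_pts = [(x, y, z) for (x, y), z in zip(new_xy, new_z)]
    total = 0.0
    for i, j in combinations(range(len(model)), 2):
        l_old = math.dist(model[i], model[j])
        l_new = math.dist(new_pts[i], new_pts[j])
        weight = l_old ** 3 if cubic_weight else 1.0
        total += (l_old - l_new) ** 2 / weight
    return total


def update_model(model, new_xy, step=0.5, min_step=1e-6):
    """One frame update: search for depths that lower the deviation.

    Illustrative coordinate pattern search (an assumption of this sketch).
    A trial depth is accepted only if it strictly lowers the deviation, so
    the returned deviation never exceeds that of keeping the old depths.
    """
    z = [p[2] for p in model]              # start from the current depths
    best = deviation(model, new_xy, z)
    while step > min_step:
        improved = False
        for k in range(len(z)):
            for delta in (step, -step):
                trial = z[:k] + [z[k] + delta] + z[k + 1:]
                d = deviation(model, new_xy, trial)
                if d < best:
                    z, best, improved = trial, d, True
        if not improved:
            step /= 2.0                    # refine the search scale
    return [(x, y, zk) for (x, y), zk in zip(new_xy, z)], best


if __name__ == "__main__":
    # Hypothetical five-point object; the model starts flat (all depths 0).
    true_pts = [(0.0, 0.0, 0.0), (1.0, 0.2, 0.5), (-0.4, 0.9, -0.7),
                (0.3, -0.8, 0.6), (-0.9, -0.3, -0.2)]
    model = [(x, y, 0.0) for x, y, _ in true_pts]
    for frame in range(1, 21):
        view = [rotate_about_vertical(p, math.radians(10 * frame))
                for p in true_pts]
        model, d = update_model(model, [(p[0], p[1]) for p in view])
        print(frame, round(d, 6))
```

Setting `cubic_weight=False` drops the $l_{ij}^3(t)$ factor, which can be used to compare the weighted and unweighted measures on the same input.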

It should be noted that in the case of orthographic projection, only relative depth
values, $z_i(t) - z_j(t)$, can be recovered, rather than absolute depth values, because under
this form of projection, the image of a given object does not change with its absolute
depth. In addition, 3-D structure is determined only up to a reflection about the image
plane, since the orthographic projection of a rotating object, and its mirror image rotating
in the opposite direction, coincide.
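
The reflection ambiguity can be checked numerically. The sketch below assumes rotation about the vertical (Y) axis and uses hypothetical helper names; it compares the orthographic image of a rotated point with the image of its mirror reflection (depth negated) rotated by the opposite angle.

```python
import math


def rotate_about_vertical(point, theta):
    """Rotate a 3-D point by theta radians about the vertical (Y) axis."""
    x, y, z = point
    return (x * math.cos(theta) + z * math.sin(theta),
            y,
            -x * math.sin(theta) + z * math.cos(theta))


def orthographic_image(point):
    """Orthographic projection onto the X-Y image plane: drop the depth."""
    return (point[0], point[1])


def mirror(point):
    """Reflect a point about the image plane (negate its depth)."""
    x, y, z = point
    return (x, y, -z)


if __name__ == "__main__":
    p = (1.0, 0.2, 0.5)                    # arbitrary example point
    theta = 0.4
    a = rotate_about_vertical(p, theta)
    b = rotate_about_vertical(mirror(p), -theta)
    # The two images agree up to rounding, while the depths have opposite sign.
    print(orthographic_image(a), orthographic_image(b))
    print(a[2], b[2])
```

Because the two image sequences are identical, no method that uses only orthographic image data can tell the object from its mirror image.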

2.2  The  Continuous  Formulation 

A continuous formulation, which uses velocity information at discrete feature points,
can be developed as follows. Assume again that there always exists an internal model
$M(t) = (x_i(t), y_i(t), z_i(t))$. Assume also that the image velocities $\dot{x}_i(t)$ and $\dot{y}_i(t)$ are
known. The problem is then formulated as the computation of the $z$ components of
velocity, $\dot{z}_i(t)$, that minimize the total continuous change in the distances between the
points. The general form of the measure of overall deviation from rigidity is given by:

$$D_c(t) = \sum_{i,j} d_c(l_{ij}(t)) \tag{2.3}$$

In our analysis, we consider different possibilities for the measure $d_c(l_{ij}(t))$. The
theoretical development of section 3.2, for example, considers the behavior of the incremental
rigidity scheme as the frames become infinitesimally close, using the following discrete
measure of the change in the distance between pairs of points:

The derivation of the continuous measure of rigidity is outlined in section 3.2. It results
in the following expression for $D_c(t)$, as a function of the coordinates and velocities of
the points (the arguments, $t$, have been omitted for simplicity):

$$D_c(t) = \sum_{i,j} \left( (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right)^2 \tag{2.5}$$


This particular measure of rigidity was used in the theoretical study for analytic
simplicity. The computer simulations of the continuous formulation use the measure of
change in the distance between points given again by the limit of $(l_{ij}(t) - l_{ij}(t'))^2$, for
infinitesimally close frames, which is:

$$d_c(l_{ij}(t)) = \left( \dot{l}_{ij}(t) \right)^2 \tag{2.6}$$

In  terms  of  the  coordinates  and  velocities  of  the  points,  this  yields  the  following  overall 
measure  of  deviation  from  rigidity: 

$$D_c(t) = \sum_{i,j} \frac{\left( (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right)^2}{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \tag{2.7}$$
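
The relation between the summands of Eqs. (2.5) and (2.7) follows from the chain rule. The short derivation below is a reader's check rather than part of the original text; the shorthand $\Delta x_{ij} = x_i - x_j$ (and likewise for $y$, $z$ and the velocities) is introduced here only for brevity.

```latex
\begin{aligned}
l_{ij}^2 &= \Delta x_{ij}^2 + \Delta y_{ij}^2 + \Delta z_{ij}^2 \\
\tfrac{1}{2}\,\frac{d}{dt}\, l_{ij}^2 = l_{ij}\,\dot{l}_{ij}
  &= \Delta x_{ij}\,\Delta\dot{x}_{ij} + \Delta y_{ij}\,\Delta\dot{y}_{ij}
   + \Delta z_{ij}\,\Delta\dot{z}_{ij} \\
\dot{l}_{ij}^{\,2}
  &= \frac{\left(\Delta x_{ij}\,\Delta\dot{x}_{ij} + \Delta y_{ij}\,\Delta\dot{y}_{ij}
   + \Delta z_{ij}\,\Delta\dot{z}_{ij}\right)^2}{l_{ij}^2}
\end{aligned}
```

The summand of Eq. (2.5) is therefore $(l_{ij}\,\dot{l}_{ij})^2$, while the summand of Eq. (2.7) is $\dot{l}_{ij}^{\,2}$; the two differ by the factor $l_{ij}^2$, and both vanish exactly when the distance between points $i$ and $j$ is not changing.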

(Eqs. (2.5) and (2.7) use slightly different measures of rigidity, but serve the same role
as measures of overall changes of rigidity for the continuous formulation. In the present
paper we do not use different notations for these measures, as it will be clear from the
context which measure is used. More specifically, Eq. (2.5) is used in the theoretical
analysis of section 3.2, and Eq. (2.7) is used in the computer simulations of section 3.1.)
In other respects, the continuous formulation is similar to the discrete formulation. A
model of the structure of the moving points is built up by continually taking into account
new velocity information over an extended time period. Again, because orthographic
projection is used, only relative velocities, $\dot{z}_i(t) - \dot{z}_j(t)$, can be recovered. This can
clearly be seen in Eqs. (2.5) and (2.7), in which the coordinates of the points and their
time derivatives all appear in differences between pairs. Further details of the continuous
formulation are presented in section 3 and Appendix A.
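
Because Eq. (2.5) is quadratic in the unknown depth velocities, setting its derivatives to zero gives a linear system. The sketch below works this out under assumptions that are not taken from Appendix A: the function names are hypothetical, the depth velocity of the first point is pinned to zero to remove the undetermined common offset, and a small Gaussian elimination routine stands in for whatever solver the original implementation used.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system (for example, a flat model)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def rigidity_change(model, image_vel, z_dot):
    """Eq. (2.5), summed over unordered pairs of points."""
    total = 0.0
    n = len(model)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy, dz = (model[i][k] - model[j][k] for k in range(3))
            dvx = image_vel[i][0] - image_vel[j][0]
            dvy = image_vel[i][1] - image_vel[j][1]
            total += (dx * dvx + dy * dvy + dz * (z_dot[i] - z_dot[j])) ** 2
    return total


def depth_velocities(model, image_vel):
    """Depth velocities minimizing Eq. (2.5), with z_dot[0] fixed at 0.

    model     : list of (x, y, z) for the current model
    image_vel : list of (x_dot, y_dot) measured image velocities
    """
    n = len(model)
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for k in range(n):
        for j in range(n):
            if j == k:
                continue
            dx, dy, dz = (model[k][c] - model[j][c] for c in range(3))
            dvx = image_vel[k][0] - image_vel[j][0]
            dvy = image_vel[k][1] - image_vel[j][1]
            A[k][k] += dz * dz             # weighted-Laplacian structure
            A[k][j] -= dz * dz
            b[k] -= dz * (dx * dvx + dy * dvy)
    reduced = [row[1:] for row in A[1:]]   # drop row and column of point 0
    return [0.0] + solve_linear(reduced, b[1:])


if __name__ == "__main__":
    omega = 0.3                            # assumed angular velocity about Y
    model = [(0.0, 0.0, 0.0), (1.0, 0.2, 0.5), (-0.4, 0.9, -0.7),
             (0.3, -0.8, 0.6), (-0.9, -0.3, -0.2)]
    image_vel = [(omega * z, 0.0) for _, _, z in model]
    print(depth_velocities(model, image_vel))
```

In this sketch a flat model (all depths equal) makes every coefficient vanish, so the routine raises an error; how the original implementation treats that case is not covered here.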

The analyses presented in this paper mainly consider single rigid objects in motion,
which are compact in the sense that the internal distances between pairs of points do not
differ much from one another. In this case, the additional $l_{ij}^3$ factor of Eq. (2.1) has little
influence on the behavior of the algorithm, so we omitted it in our theoretical analysis and
computer simulations for the sake of simplicity. In general, however, a proper weighting
(and not necessarily a cubic factor) of the influence of different distances among points
is necessary for a better performance of the algorithm.

3. Positions vs. Velocities as Input to the Recovery of Structure

We stated earlier that on the basis of computer simulations, Ullman (1984) observed
that when analyzing objects undergoing rigid rotation, the convergence of the
incremental rigidity scheme to the correct solution was slower when smaller angular displacements
between frames were used. In this section, we analyze this phenomenon from a
theoretical perspective, focusing on the long-term stability of the computed 3-D model. We
first examine the behavior of the continuous formulation, which uses a current model
and measured image velocities at discrete points as input to the recovery process. We
then turn to the discrete formulation, which uses the discrete positions of discrete points
as input, and examine the long-term stability of its solution as a function of the angular
displacement between frames. Our main conclusions are the following. First, for the
particular class of motions considered, the discrete formulation always yields a 3-D
solution that converges asymptotically to the correct solution, but the rate of convergence
varies with the angular displacement. The rate of convergence increases with increasing
angular displacement up to a maximum, and then decreases with further increases in
this displacement. The position of this maximum depends on such factors as the type of
motion and geometric structure of the points.

Although the orthographic projection is in general not physically valid, it is used here
because it allows a simpler formulation of the problem, and is therefore better suited to
theoretical analysis and computer implementation. It allowed us to gain a deeper insight
into the nature of the phenomena studied. We nevertheless implemented the equivalent
perspective formulation and confirmed that the basic results remain valid under this
formulation. The use of perspective projection enables the recovery of the structure
of objects undergoing pure translation, which was not possible under the orthographic
projection. In the case of translation, the rate of convergence of the computed 3-D model
to the true structure also increases with increasing spatial displacements between frames.
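
The difference between the two projections under pure translation can be illustrated numerically. The sketch below is not from the original text: the helper names are hypothetical, and the perspective model (image coordinates equal to focal length times $x/z$ and $y/z$, with the focal length set to 1) is a common textbook convention assumed here, not necessarily the one used in section 4.

```python
def orthographic(point):
    """Orthographic projection onto the X-Y image plane."""
    return (point[0], point[1])


def perspective(point, focal_length=1.0):
    """Pinhole perspective projection; assumes positive depth z."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def translate(point, displacement):
    """Shift a 3-D point by a displacement vector."""
    return tuple(p + d for p, d in zip(point, displacement))


def relative_image(points, project):
    """Image positions of all points relative to the image of the first."""
    images = [project(p) for p in points]
    u0, v0 = images[0]
    return [(u - u0, v - v0) for u, v in images]


if __name__ == "__main__":
    pts = [(0.0, 0.0, 4.0), (1.0, 0.2, 4.5), (-0.4, 0.9, 3.3)]
    moved = [translate(p, (0.5, -0.2, 1.0)) for p in pts]
    # Orthographic: relative image positions do not change under translation.
    print(relative_image(pts, orthographic))
    print(relative_image(moved, orthographic))
    # Perspective: relative image positions change with the translation.
    print(relative_image(pts, perspective))
    print(relative_image(moved, perspective))
```

Under orthographic projection a translating object produces an image that only shifts as a whole, so the relative motion carries no depth information; under perspective projection the relative image positions depend on depth, which is what makes recovery possible in principle.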

Before presenting the results of the theoretical analysis, we illustrate the behavior
of the discrete and continuous formulations through the results of computer simulations.
We show that the continuous formulation yields an initial fast convergence to a close
approximation of the true structure of the moving points, but then oscillates over a large
range. It consequently does not yield a stable long-term recovery of structure.

3.1  Observations  from  Computer  Simulations 

In this section, we briefly illustrate the behavior of the discrete and continuous
formulations of the incremental rigidity scheme, for the special case of rigid rotation of a
small set of discrete points about the vertical axis. (For details of the computer
implementation, see Appendix A.) In the case of the discrete formulation, we examine the
rate of convergence of the algorithm and the quality of the solution that it yields, as a
function of the angular displacement between frames. We then compare its performance with
that of the continuous formulation. In all of the examples presented here, the input consisted
of a set of five points in space. The first point is assumed to lie at the origin of a
coordinate system that is displaced from the viewer along the line of sight. The positions
of the remaining four points were chosen randomly. Fig. 1 illustrates a typical set of five
points, showing their projections onto the X-Y plane (Fig. 1a) and the X-Z plane
(Fig. 1b). In the simulations, the projected positions and velocities of the points were
computed analytically, rather than measured from real image sequences.
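
As an illustration of this setup, the following is a minimal sketch of how such input could be generated: five points (the first at the origin), rigid rotation about the vertical axis, and analytically computed orthographic image positions and velocities. The function names, the sampling range for the random coordinates and the unit angular velocity are assumptions made for the example; they are not taken from Appendix A.

```python
import math
import random

def make_points(n=5, seed=0):
    """Return n points; the first lies at the origin, the rest are random in [-1, 1]^3."""
    rng = random.Random(seed)
    pts = [(0.0, 0.0, 0.0)]
    for _ in range(n - 1):
        pts.append(tuple(rng.uniform(-1.0, 1.0) for _ in range(3)))
    return pts

def rotate_y(p, theta):
    """Rotate point p = (x, y, z) by theta radians about the vertical (y) axis."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * z, y, -s * x + c * z)

def project(points, theta, omega=1.0):
    """Orthographic image positions (x, y) and image velocities (vx, vy) at angle theta.

    omega is the (assumed constant) angular velocity in radians per unit time.
    """
    positions, velocities = [], []
    for p in points:
        x, y, z = rotate_y(p, theta)
        positions.append((x, y))
        # With theta = omega * t, dx/dt = omega * z, and y does not change.
        velocities.append((omega * z, 0.0))
    return positions, velocities
```

Because the velocities are derived in closed form, they carry no measurement noise, which matches the statement that positions and velocities were computed analytically rather than measured.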


Figure 1. A set of five points with random coordinates is projected onto (a) the X-Y plane,
and (b) the X-Z plane.

Fig. 2 illustrates the behavior of the discrete formulation of the incremental rigidity
scheme, as a function of the angular displacement between frames. Each figure shows
a bird's eye view of the set of rotating points (that is, their projection onto the X-Z
plane), with filled circles representing the true positions of the points and open circles
representing the structure computed by the algorithm. Fig. 2a shows the true positions
of the points and the initial model at time t = 0. The initial model is assumed to be
flat. Figs. 2b and 2c show the true and computed configurations of points after 120° of
rotation. This final position was reached by taking three steps of 40° (Fig. 2b), and 12
steps of 10° (Fig. 2c). It can be seen that the use of a smaller number of more disparate
views yields a 3-D model that is closer to the true solution. As noted in Appendix A, a
steepest descent minimization algorithm was used for most of our computer simulations.
We also analyzed a small number of examples using an exhaustive search algorithm to
find the minimum solution, and again found the accuracy of the solution to vary with
the angular displacement between frames. We therefore believe this to be a fundamental
behavior of the incremental rigidity scheme that is not simply a consequence of the
particular algorithm used to implement the scheme. This observation is supported further
by the theoretical analysis of the next section.
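
To make the update step concrete, here is a rough sketch of one discrete step of the scheme using steepest descent with a backtracking step size. It is not the implementation of Appendix A. It assumes the squared-distance rigidity measure used in the theoretical analysis (the sum over pairs of the squared change in squared 3-D distance), pins the first point at depth zero, and adds a small deterministic jitter to the starting depths, because an exactly flat starting guess is a stationary point of this measure. All function names, the jitter and the step-size schedule are hypothetical.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two 3-D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def rigidity_cost(model, image, z_new):
    """Sum over pairs of (new squared distance - old model squared distance)^2."""
    new = [(x, y, z) for (x, y), z in zip(image, z_new)]
    n = len(model)
    return sum((sq_dist(new[i], new[j]) - sq_dist(model[i], model[j])) ** 2
               for i in range(n) for j in range(i + 1, n))

def gradient(model, image, z_new):
    """Partial derivatives of rigidity_cost with respect to each new depth."""
    new = [(x, y, z) for (x, y), z in zip(image, z_new)]
    n = len(model)
    grad = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                r = sq_dist(new[i], new[j]) - sq_dist(model[i], model[j])
                grad[i] += 4.0 * r * (z_new[i] - z_new[j])
    return grad

def update_model(model, image, iters=500, jitter=1e-3):
    """One incremental step: keep the new image (x, y) and choose depths that best
    preserve the 3-D distances of the previous model. Returns (new_model, cost)."""
    # Start from the old depths plus a small jitter (assumption, see lead-in).
    z = [0.0] + [p[2] + jitter * (i + 1) for i, p in enumerate(model[1:])]
    cost = rigidity_cost(model, image, z)
    step = 0.1
    for _ in range(iters):
        g = gradient(model, image, z)
        g[0] = 0.0  # point 0 stays at the origin
        while step > 1e-12:
            trial = [zi - step * gi for zi, gi in zip(z, g)]
            trial_cost = rigidity_cost(model, image, trial)
            if trial_cost < cost:
                z, cost = trial, trial_cost
                step *= 2.0  # accepted: try a longer step next time
                break
            step *= 0.5      # rejected: shorten the step
        else:
            break            # no descent step found
    return [(x, y, zi) for (x, y), zi in zip(image, z)], cost
```

A typical use would start from a flat model built from the first image, call `update_model` once per new frame, and feed each returned model into the next call. Since a step is accepted only when it lowers the cost, the returned cost never exceeds the cost of the starting guess, but the sketch makes no claim about reaching the global minimum.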

In Fig. 3, we show a series of graphs that illustrate, in a different way, the behavior
of the discrete formulation of the incremental rigidity scheme as a function of angular
displacement. In this case, the set of points shown in Figs. 1 and 2 was rotated by
four full revolutions. Each of the graphs shows the error between the true and computed
structures, as a function of time. In particular, the following quantity is plotted:


$$ \sum_{i,j} (d_{ij} - l_{ij})^2 $$


where $d_{ij}$ is the correct 3-D distance between points $i$ and $j$ in the object, and $l_{ij}$ is


Figure 2. (a) The true configuration of five points (filled circles) is compared with the initial
configuration of the points in the model (open circles) at time t = 0. The projection is onto the
X-Z plane. (b) The comparison between the true and computed positions of the points after
three steps of 40°. (c) The comparison between the true and computed positions of the points
after 12 steps of 10°.
the corresponding distance between points $i$ and $j$ in the computed model. The graphs
shown in Figs. 3a through 3e correspond to rotations with angular displacements of 40°,
20°, 10°, 5° and 1°, respectively. In Fig. 3f, the five graphs are shown superimposed.
Again, it can be seen that the rate of convergence and quality of the solution improve
with larger angular displacements. For a particular total angular extent, the error in the
computed model decreases with increasing angular displacement between frames. In the
case of 40° displacements, the algorithm converges asymptotically and monotonically
to the final solution. For smaller displacements, the convergence is no longer strictly
monotonic, but is still essentially asymptotic toward the final solution. Fig. 4 shows
the same set of graphs superimposed, but with the error plotted on a log scale. The
convergence of the solution is now essentially linear, with varying slopes, suggesting that
the actual convergence is exponential.
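
The inference from a straight line on a log scale to exponential convergence can be made quantitative: if the error behaves as C exp(-rt), then its logarithm is linear in t with slope -r. The sketch below estimates r by an ordinary least-squares line fit to the logarithm of the error. The function name and the synthetic data in the usage note are illustrative assumptions, not values from the simulations.

```python
import math

def exponential_rate(times, errors):
    """Least-squares fit of log(error) against time.

    Returns (rate, log_intercept) for the model error(t) = exp(log_intercept) * exp(-rate * t).
    All errors must be strictly positive.
    """
    logs = [math.log(e) for e in errors]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    return -slope, mean_l - slope * mean_t
```

Applied to each of the error curves separately, the fitted rates would give one number per angular displacement, which is a compact way to compare the slopes seen on the log plot.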


Figure 4. Graphs of the error in the internal distances between points in the computed 3-D
model, as a function of time, plotted on a log scale. The points are rotated 4 full revolutions,
in steps of 40°, 20°, 10°, 5° and 1°. The graphs are shown superimposed, with the angular
displacements indicated above each graph.


Fig. 5 illustrates the behavior of the continuous formulation of the incremental
rigidity scheme. The same set of points used previously was again rotated about the
vertical axis and the 3-D model was computed at infinitesimally closely spaced times,
using the instantaneous velocities projected onto the image (see Appendix A for details).
In Fig. 5a, we compare the true positions of the points (filled circles) with the best
solution (open circles) obtained over 10 full revolutions of the points. Although the model
is quite close to the true structure at this position of the points, the solution oscillates
significantly over an extended time period. A graph of the error in the computed model
over the 10 revolutions is shown in Fig. 5b. There is an initial fast convergence toward
the true structure of the points, but the algorithm then oscillates with high amplitude
away from the true solution. When the error in the model is high, depth reversals often
occur. Fig. 5c shows an example of the true and computed structures at a time of
complete depth reversal. Such reversals were also observed with the discrete formulation
of the scheme, although rarely.

Figure 5. (a) The best solution (open circles) obtained over 10 revolutions of the points is
compared with the true positions (filled circles), for the case of the continuous formulation. (b)
The error in the internal 3-D distances between points in the model, as a function of time. (c)
A complete depth reversal between the true and computed structures.

Fig. 6 illustrates the typical behavior of the discrete formulation of the incremental
rigidity scheme, averaged over 10 configurations of five points. For each of the
configurations, the first point was placed at the origin of the coordinate system and the
coordinates of the remaining four points were chosen randomly. The set of points was then
rotated by three discrete angular steps, with the size of the angular displacements ranging
from 1° to 90°. The initial 3-D model for the points was flat, and a new model was computed
for each of the three discrete positions of the points. After the third step, we computed
the following measure of the absolute error in the internal distances between points:
where $d_{ij}$ is the true 3-D distance between points $i$ and $j$, $l_{ij}$ is the distance between
points $i$ and $j$ in the computed model and $n$ is the total number of pairs of points.
This measure was chosen because it expresses an average of the error in the model
relative to the true structure. (Note that a lower value for this measure corresponds to
less error in the computed model.) We also considered other measures of error in the
distances between points and in the actual depth values, and found the general behavior
of the algorithm to be the same under different measures. The above error measure was
averaged over the 10 configurations of points, and is plotted in Fig. 6 as a function of
angular displacement. It can first be seen that in general, the error after three discrete
steps of the algorithm varies with the size of the angular displacement between frames.
There is a steady improvement in performance as the displacement increases, to about
50°, followed by a degradation for increments of 60°, and a steady decrease in performance
from 70° to 90°. The degradation in performance for an angular displacement of 60° was
common; 8 of the 10 configurations of points exhibited this behavior. This degradation
may be a consequence of the symmetry between the initial and final views, which are
rotated 180° from one another. In general, the convergence of the algorithm for three
discrete steps degrades significantly for smaller angular displacements. This result is not
surprising, in that there is very little change in the discrete views for such small angles
and a reduced total angular extent. The deterioration for large angles probably occurs
because at 90°, the number of views available to the structure from motion process is
reduced to two.
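
One possible reading of the last two remarks can be checked with simple arithmetic. Under orthographic projection, rotating the object by 180° about the vertical axis maps each image point (x, y) to (-x, y), so every inter-point distance in the image is unchanged and the view adds no new distance constraint. The sketch below, which is an interpretation rather than part of the original analysis, counts the views that remain distinct once orientations 180° apart are identified. The function name and the restriction to whole-degree steps are assumptions.

```python
def distinct_views(step_deg, n_steps=3):
    """Number of views (initial view plus n_steps rotated views) that differ
    once orientations 180 degrees apart are treated as equivalent.

    step_deg is assumed to be a whole number of degrees.
    """
    return len({(k * step_deg) % 180 for k in range(n_steps + 1)})
```

With three steps, a 40° increment gives four distinct views, a 60° increment gives three (the final view coincides with the initial one), and a 90° increment gives only two.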
Fig. 7 illustrates the error in the computed model, as a function of angular
displacement, for the case in which the overall rotation of the points was kept essentially
constant. The same set of 10 random configurations of points was rotated by discrete
angular steps, with the size of the steps varying from 10° to 90°. In this case, each
configuration was rotated by a total of (approximately) 180° and 360°, and the same
measure shown in Eq. (3.2) was then computed. This measure was again averaged over
the 10 configurations of points, and is plotted as a function of angular displacement in
Fig. 7. Fig. 7a shows the data for the case of 180° of rotation. Note that for the angles
40°, 50°, 70° and 80°, there are two points plotted, corresponding to multiples of the
angular displacement that are just less than and greater than 180°. The graph that is
superimposed on the points passes between the pairs of points for these angles. Fig. 7b
shows the same data for the case of 360° of rotation (for the case of angular displacements
of 50° and 70°, the points were rotated by a total of 350°). Also shown in Figs. 7a and
7b are the average errors in the solution that were derived by the continuous formulation
of the algorithm after 180° and 360° of rotation of the 10 configurations of points. These
two data points are indicated by the stars along the ordinate of the graph. The main
observation to be made is that when the points are rotated by a constant total amount,
there is still a strong dependence of the rate of convergence of the solution on the size
of the angular displacement between frames. There is again an improvement in
convergence rate as this angle increases, up to about 50°, followed by a slight worsening for
an angle of 60°, then improvement again for 70°, followed by a steady worsening to 90°.
The degradation in performance for an angular displacement of 60° was again common,
occurring for 7 of the 10 configurations of points. The deterioration of the convergence
rate with decreasing angular displacements is not obvious, because in spite of the fact
that the changes between consecutive frames are smaller, there are many more frames
altogether. In addition, while the continuous formulation provides a good estimate of the
true solution after 180° (see Fig. 7a), the solution then degrades significantly, providing
a relatively poor solution after 360° of rotation (see Fig. 7b).

To conclude, the results of computer experiments with the discrete formulation of
the incremental rigidity scheme show a clear dependence of the behavior of the computed
solution on the size of the angular displacements between frames. In the limiting case
of the continuous formulation, there is an initial fast convergence of the solution toward
the true solution, followed by a substantial oscillation of the solution. The next section
presents a theoretical analysis that attempts to explain this phenomenon.

3.2  Analytic  Study  of  Convergence  Properties 

In this section we outline the main conclusions of our theoretical analysis. The purpose of
this analysis is not a general study of the behavior of the incremental rigidity scheme, as
such an analysis would be too cumbersome. Rather, it concentrates on a formal analysis
of the convergence properties of the algorithm, for a family of examples that we believe
are relevant to the general recovery of structure from motion. We emphasize the concepts
raised by this analysis, and comment on their generality. This section considers a subset
of those examples analyzed in the computer simulations of section 3.1.


than the difference between the $l_{ij}(t)$'s themselves, thus avoiding the use of square roots.
This measure also does not contain the cubic factor, $l_{ij}^3(t)$, in the denominator, which was
included in the measure proposed by Ullman as a means of reducing the contributions
of distant points relative to nearby ones. In this theoretical analysis, we mainly consider
configurations of points that are compact, in the sense that the internal distances between
pairs of points do not differ much from one another. In this case, the cubic factor seems
not to be important.

The main problem that this theoretical analysis addresses is the long term stability
of the solutions obtained by the incremental rigidity scheme, as a function of the angular
separation between successive frames, for the case of rotation of rigid objects about a
single axis in space. In the previous section, we observed the variation of convergence
rate with angular displacement, in the results of computer simulations. In the present
analysis, we examine this convergence phenomenon through a study of the stability of
the incremental rigidity scheme. In particular, we analyze the behavior of the internal
model under small perturbations, when it is near the true solution. We first examine the
limiting case, where the frames are arbitrarily close to one another, that is when $t' \to t$.
The analysis of this case is presented in sections 3.2.1 and 3.2.2. We then explore the
convergence properties of the discrete formulation in sections 3.2.3 and 3.2.4. Finally, in
section 3.2.5 we summarize the theoretical properties of the two schemes, as a function
of angular displacement.

3.2.1  The  Continuous  Formulation 

In the case of the continuous formulation, $l_{ij}(t')$ can be determined to a good first order
approximation by $l_{ij}(t)$ and its derivative, that is:

$$ l_{ij}(t') \approx l_{ij}(t) + \dot{l}_{ij}(t) \times (t' - t). \tag{3.4} $$

As described earlier, when such an approximation is used, rather than computing the
depth values $z_i(t')$ that maximize rigidity, we can compute the z components of velocity
$\dot{z}_i(t)$ that do the same task. It can be shown that in the limit, as $t' \to t$, using Eq. (3.3),
and the definition of $l_{ij}$, the quantity to be minimized is:

$$ D_c(t) = \sum_{i,j} \left( (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right)^2. \tag{3.5} $$

Thus, given the model at time $t$, and the x and y components of velocity in the image,
we compute the temporal derivatives of the depth values that minimize $D_c(t)$ given by
Eq. (3.5). This is the continuous formulation that we use in the theoretical analysis of
the convergence properties of the incremental rigidity scheme.

Note again that the quantities that can be computed are not the absolute values of
$\dot{z}_i(t)$ but the relative values $\dot{z}_i(t) - \dot{z}_j(t)$. It follows that without loss of generality, we can
set $(x_0, y_0, z_0)$ at the origin of the coordinate system, that is $(x_0, y_0, z_0) = (0,0,0)$. At
every instant, the remaining $n-1$ points are then given relative to this first point. Thus
the number of independent variables in the model is $n-1$. Introducing the following
notational simplification:

$$ a_{ij} = (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j), \tag{3.6} $$

Eq. (3.5) can then be rewritten as follows:

$$ D_c(t) = \sum_{i=1}^{n-2} \sum_{j=i+1}^{n-1} \left( a_{ij} + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right)^2 + \sum_{i=1}^{n-1} \left( a_{0i} + z_i \dot{z}_i \right)^2. \tag{3.7} $$

As described above, we are looking for the $\dot{z}_i(t)$ that minimize $D_c(t)$. The necessary
condition for such a minimization is that the partial derivatives of $D_c(t)$ with respect to
the $\dot{z}_i(t)$ are zero:

$$ \frac{\partial D_c}{\partial \dot{z}_i} = 0. \tag{3.8} $$

In section 3.5 we show that Eqs. (3.8) represent a set of $n-1$ linear equations with $n-1$
variables $\dot{z}_i(t)$, which generally have a unique solution. It follows that because $D_c(t) \geq 0$,
thus being bounded from below, the $\dot{z}_i(t)$ that satisfy Eqs. (3.8) also minimize $D_c(t)$, so
that these equations generally represent a sufficient condition for the minimization.
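
Because $D_c$ is quadratic in the unknown depth velocities, the condition of Eqs. (3.8) amounts to a small linear system. The following sketch, offered only as an illustration, assembles the normal equations for the pairwise form of $D_c$ with point 0 fixed at the origin and solves them by Gaussian elimination. The function names, argument layout and choice of solver are assumptions and are not taken from the original implementation.

```python
def solve_linear(m, rhs):
    """Solve m * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def depth_velocities(pos, vel, z):
    """Depth velocities of points 1..n-1 that minimize D_c (point 0 has zero depth velocity).

    pos, vel: image positions (x, y) and image velocities (vx, vy) of all n points.
    z: current model depths of all n points, with z[0] = 0.
    """
    n = len(z)
    m = [[0.0] * (n - 1) for _ in range(n - 1)]
    rhs = [0.0] * (n - 1)
    for k in range(1, n):
        for j in range(n):
            if j == k:
                continue
            u = z[k] - z[j]
            a = ((pos[k][0] - pos[j][0]) * (vel[k][0] - vel[j][0])
                 + (pos[k][1] - pos[j][1]) * (vel[k][1] - vel[j][1]))
            # d/d(zdot_k) of (a + u * (zdot_k - zdot_j))^2 set to zero
            m[k - 1][k - 1] += u * u
            if j >= 1:
                m[k - 1][j - 1] -= u * u
            rhs[k - 1] -= a * u
    return solve_linear(m, rhs)
```

When the model depths equal the true depths of a rigidly rotating object, every term of $D_c$ can be made zero, so the solver should return the true depth velocities whenever the system is nonsingular (for instance, when all model depths are distinct).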

The $\dot{z}_i(t)$ that satisfy Eqs. (3.8) are expressed in terms of $z_i(t)$, thus representing
a system of $n-1$ differential equations with $n-1$ variables. This system, however, is
generally difficult to solve explicitly, as it is nonlinear. The only solutions that we were
able to verify by straight substitution are the true motion of the object and its depth
reversal (which are equivalent, under orthographic projection). In the next section we
present a stability analysis of the true rigid motion solution; that is, we examine the
stability of the algorithm when its solution is close to the true solution. (The computer
simulations address the full convergence behavior of the algorithm.) We concentrate on
rigid motion, because we feel that nonrigid transformations may disrupt the stability
of the solutions, introducing instabilities due to peculiar types of motion. For example,
the internal model may be too sluggish in adjusting itself when fast changes of structure
occur. By considering rigid motion in this analysis, we address more fundamental aspects
of the theory itself.

3.2.2  Stability  Analysis  of  the  Continuous  Formulation 

The idea of a stability analysis can be stated as follows. Suppose that at a given instant $t_0$,
the computed 3-D model is very close to the true solution, that is $\tilde{z}_i(t_0) = z_i(t_0) + \epsilon_i(t_0)$,
where $z_i(t)$ is the true depth value at point $i$. Because the system is perturbed at $t_0$, we
expect it in general to be perturbed for every $t > t_0$, that is $\tilde{z}_i(t) = z_i(t) + \epsilon_i(t)$. The
system is said to be asymptotically stable if the following is true:

$$ \lim_{t \to \infty} \epsilon_i(t) = 0, \tag{3.9} $$

for every $i$. If the $\epsilon_i(t)$ remain bounded, but do not converge to zero, the system is
defined to be weakly stable. If, however, $\lim_{t \to \infty} \epsilon_i(t) = \infty$ for some $i$, the system is said
to be unstable.


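If we suppose that $\epsilon_i(t)$ is small enough for every $t > t_0$, then we can make the
following first order approximation of Eqs. (3.8) in terms of $\epsilon_i(t)$ and $\dot{\epsilon}_i(t)$ (the arguments,
$t$, have been removed for simplicity):

$$ \sum_{j=1}^{n-1} \left( \frac{\partial^2 D_c}{\partial \dot{z}_i \, \partial \dot{z}_j} \, \dot{\epsilon}_j + \frac{\partial^2 D_c}{\partial \dot{z}_i \, \partial z_j} \, \epsilon_j \right) = 0. \tag{3.10} $$

A derivation of Eqs. (3.10) is given in Appendix B. Eqs. (3.10) can be readily solved for
$\dot{\epsilon}_j(t)$ in terms of $\epsilon_j(t)$. For notational simplicity, we write this solution in matrix form:

$$ \dot{\epsilon} = - \left[ \frac{\partial^2 D_c}{\partial \dot{z}_i \, \partial \dot{z}_j} \right]^{-1} \left[ \frac{\partial^2 D_c}{\partial \dot{z}_i \, \partial z_j} \right] \epsilon, \tag{3.11} $$

where $\epsilon = (\epsilon_1, \ldots, \epsilon_{n-1})^T$, $[\partial^2 D_c / \partial \dot{z}_i \partial z_j]$ is the matrix whose element at row $i$ and
column $j$ is $\partial^2 D_c / \partial \dot{z}_i \partial z_j$, and $[\partial^2 D_c / \partial \dot{z}_i \partial \dot{z}_j]^{-1}$ is the inverse of the matrix whose
element at row $i$ and column $j$ is $\partial^2 D_c / \partial \dot{z}_i \partial \dot{z}_j$, provided it exists. The matrix $A$ is
defined to be $-[\partial^2 D_c / \partial \dot{z}_i \partial \dot{z}_j]^{-1} [\partial^2 D_c / \partial \dot{z}_i \partial z_j]$, and is evaluated at the true solution
itself at every instant.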
Using Eqs. (3.11) to study the stability of the system, it is not possible to prove
that the system is unstable. To do so requires a proof that $\epsilon$ is unbounded as $t \to \infty$,
but in this case the first order approximation used to derive Eqs. (3.10) no longer holds.
It is still possible to determine, however, whether or not the system is asymptotically
stable. Indeed, we will show in this section that for many types of motion, the continuous
formulation of the incremental rigidity scheme is not asymptotically stable. The results of
computer simulations provide evidence that the formulation is, in general, not unstable,
thus being weakly stable.

Eqs. (3.10) represent a system of ordinary linear differential equations, which
are used in conjunction with Eq. (3.7) where $D_c(t)$ is defined. Note therefore that $A$
has time dependent components. How do the different types of motion determine $A$? A
generalized motion of a rigid body can be described instantaneously as a rotation about
a fixed axis in space plus a translation of the origin of the coordinates. We noted before
that, under orthographic projection, all that can be determined are the relative depths of
the points and not their absolute values. Also, in the same case, the only relevant data
to the problem of recovering structure from motion are the relative image coordinates
(terms $a_{ij}$ in Eq. (3.7)). Translation does not change any relative distances, as it changes
the positions of all points by the same amount. Thus only rotations need to be considered
in this problem.

As mentioned earlier, the matrix $A$ in Eq. (3.11) is in general time dependent. Thus
this system cannot be solved by the standard method of characteristic values, available to
systems with constant coefficients. Furthermore, the general rotation can have variable
angular velocity, and the axis of rotation can change over time, making Eq. (3.11) very
difficult to integrate analytically.

It is not necessary, however, to solve Eq. (3.11) in order to reach certain conclusions
regarding the nonconvergence of $\epsilon$ to 0. For example, let $\epsilon_j(t)$ be $n-1$ arbitrary solutions
of Eq. (3.11) with initial conditions $\epsilon_j(0)$ specified. Then the $(n-1) \times (n-1)$ matrix
$E(t) = [\epsilon_1(t), \ldots, \epsilon_{n-1}(t)]$ satisfies the Liouville-Jacobi formula (Yakubovich and
Starzhinskii, 1975):

$$ \mathrm{Det}(E(t)) = \mathrm{Det}(E(t_0)) \exp\left( \int_{t_0}^{t} \mathrm{Tr}(A(t')) \, dt' \right). \tag{3.12} $$

It follows that a necessary condition for $\lim_{t \to \infty} E(t) = 0$, that is, for the system to be
asymptotically stable, is:

$$ \int_{t_0}^{\infty} \mathrm{Tr}(A(t')) \, dt' = -\infty. \tag{3.13} $$

We found that condition (3.13) does not hold for many interesting types of motion.
For example, for any three point configuration, rotating with arbitrary angular velocity
about an axis perpendicular to the viewing axis, we can derive the following relationship,
by using Eqs. (3.7) and (3.11) (details of the derivation are given in Appendix C):


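$$ \mathrm{Tr}(A(t)) = - \frac{2 z_1 \dot{z}_1 - \dot{z}_1 z_2 - z_1 \dot{z}_2 + 2 z_2 \dot{z}_2}{z_1^2 - z_1 z_2 + z_2^2}, \tag{3.14} $$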

where $z_1$, $z_2$, $\dot{z}_1$, $\dot{z}_2$ denote the z coordinates and velocities evaluated at time $t$. This trace
is periodic and has zero integral over the cycle, for rotations of fixed angular velocities:

$$ \int_{\text{cycle}} \mathrm{Tr}(A(t)) \, dt = 0, \tag{3.15} $$

a result that contradicts Eq. (3.13). Thus, for any three point configuration having this
type of motion, the internal 3-D model computed by the continuous formulation will not
converge to the true structure.
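
The zero-integral property can be checked numerically. The sketch below is not the derivation of Appendix C. It assumes the three point form of $D_c$ from Eq. (3.7) with point 0 at the origin, writes out by hand the two second-derivative matrices of Eq. (3.11) at the true (rigid) solution, and integrates Tr(A) over one revolution about the vertical axis at constant angular velocity. The function names and sample points are hypothetical. Under these assumptions the trace works out to the negative logarithmic derivative of $z_1^2 - z_1 z_2 + z_2^2$, which is why its integral over a full cycle vanishes.

```python
import math

def trace_A(z1, z2, dz1, dz2):
    """Trace of A = -M^{-1} N for three points, evaluated at the true solution.

    M[i][j] = d2D/(d zdot_i d zdot_j) and N[i][j] = d2D/(d zdot_i d z_j), with
    D = (a01 + z1*dz1)^2 + (a02 + z2*dz2)^2 + (a12 + (z1 - z2)*(dz1 - dz2))^2,
    where every bracket is zero for rigid motion.
    """
    u, v = z1 - z2, dz1 - dz2
    m = [[2 * (z1 * z1 + u * u), -2 * u * u],
         [-2 * u * u, 2 * (z2 * z2 + u * u)]]
    n = [[2 * (z1 * dz1 + u * v), -2 * u * v],
         [-2 * u * v, 2 * (z2 * dz2 + u * v)]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    tr = sum(inv[i][k] * n[k][i] for i in range(2) for k in range(2))
    return -tr

def integrate_trace(p1, p2, omega=1.0, samples=2000):
    """Integral of Tr(A) over one revolution, by an equally spaced sum over the period."""
    period = 2 * math.pi / omega
    dt = period / samples
    total = 0.0
    for s in range(samples):
        th = omega * s * dt
        c, sn = math.cos(th), math.sin(th)
        # Rotation about the vertical axis: x = c*x0 + s*z0, z = -s*x0 + c*z0, dz/dt = -omega*x
        x = [c * p[0] + sn * p[2] for p in (p1, p2)]
        z = [-sn * p[0] + c * p[2] for p in (p1, p2)]
        dz = [-omega * xi for xi in x]
        total += trace_A(z[0], z[1], dz[0], dz[1]) * dt
    return total
```

The two points should be chosen so that their depths never vanish at the same instant (for example (1, 0.5, 0) and (0, -0.3, 1)); otherwise the matrix being inverted becomes singular during the revolution.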

The fact that this result is derived for three point structures is nontrivial. In fact, it
is shown in the analysis of the discrete formulation, later in this section, that converging
models can be constructed for three point configurations under certain conditions. A
detailed analysis for structures with a higher number of points is in general very
cumbersome, but we verified that Eq. (3.15) holds for special four point configurations,
such as those that, when viewed along the rotation axis, appear as squares, rectangles or
trapezoids.

We also note that having the rotation axis perpendicular to the viewing axis is the
best situation, as far as convergence is concerned, because depth motion is lost as the
rotation axis is slanted towards the viewing axis (that is, away from the image plane).
In particular, note that in the limiting case, where the rotation axis is parallel to the
viewing axis, no motion in depth occurs whatsoever, and so the structure cannot be
recovered from relative motion.

Is the instability of the continuous formulation due to properties of the projections of
the object at particular positions in its revolution, which somehow convey less information
to the recovery of structure from motion (for example, when many of the points belonging
to an object have little motion in depth)? In order to check this hypothesis, we explored
the convergence of the internal model when the object was oscillated back and forth
over a small angular extent, around different positions. It was found again, by checking
the validity of Eq. (3.15), that regardless of position, the internal model is unable to
converge to the true structure, thus giving a negative answer to the question raised. This
suggested to us that the reason underlying the instability of the continuous formulation
may be based on the smallness of the angular displacement between consecutive frames,
and not on the structure of the frames themselves. Pursuing this idea further, the next
sections address the convergence behavior of the discrete (and more general) formulation
of the incremental rigidity scheme, focusing on the effects of the size of the angular
displacements between frames.


3.2.3  The  Discrete  Formulation 

The continuous formulation was derived from the approximation of Eqs. (3.3), as the
angular displacements between the consecutive frames used in the structure from motion
process become small, and it was expressed in Eq. (3.5). If we do not make such an
approximation  we  have  a  more  general  formulation,  which  is  however  discrete.  By  being 
more  general,  this  formulation  allows  a  study  of  the  incremental  rigidity  scheme  as  a 
function  of  the  angular  displacement  between  frames,  and  enables  us  to  verify  whether 
the  total  instability  found  in  the  continuous  formulation  is  due  to  an  “artifact”  of  the 
approximation  made,  namely  an  infinitesimal  angular  displacement  between  frames,  or 
rather  is  due  to  a  continuous  process  in  which  the  system  becomes  less  stable  with  smaller 
angles  and  eventually  becomes  unstable  when  the  displacement  is  small  enough. 

In order to stress the discrete nature of the general formulation, we first rewrite Eq.
(3.3), using as independent variable $t = k\tau$, where $k$ is an integer and $\tau$ a fixed interval:

$$ D_d(k\tau) = \sum_{i,j} \Big[ (x_i(k\tau) - x_j(k\tau))^2 + (y_i(k\tau) - y_j(k\tau))^2 + (z_i(k\tau) - z_j(k\tau))^2 - (x_i((k+1)\tau) - x_j((k+1)\tau))^2 - (y_i((k+1)\tau) - y_j((k+1)\tau))^2 - (z_i((k+1)\tau) - z_j((k+1)\tau))^2 \Big]^2. \tag{3.16} $$

Note that as in the continuous formulation, the quantities of interest for the theory are
relative rather than absolute, i.e. $(z_i((k+1)\tau) - z_j((k+1)\tau))$. Thus again, without loss
of generality we can set $(x_0, y_0, z_0) = (0,0,0)$, and study the behavior of the other $n-1$
points relative to it. For notational simplicity, let:

$$ b_{ij}(k\tau) = (x_i((k+1)\tau) - x_j((k+1)\tau))^2 + (y_i((k+1)\tau) - y_j((k+1)\tau))^2 - (x_i(k\tau) - x_j(k\tau))^2 - (y_i(k\tau) - y_j(k\tau))^2. \tag{3.17} $$

We now rewrite Eq. (3.16), using this notational simplification and without explicitly
specifying $\tau$:
$$ D_d(k) = \sum_{i,j} \left[ b_{ij}(k) + (z_i(k+1) - z_j(k+1))^2 - (z_i(k) - z_j(k))^2 \right]^2. \tag{3.18} $$


In the present formulation we are looking for the $z_i(k+1)$ that minimize $D_d(k)$. As
in the continuous case, the necessary and sufficient condition for this minimization is:

$$ \frac{\partial D_d}{\partial z_i(k+1)} = 0, \tag{3.19} $$
which represents a set of $n - 1$ nonlinear equations implicitly relating the $z_i(k)$ and
$z_i(k+1)$. The only instance in which we have been able to derive a full analytic solution
for these equations is in the two point case. This case is a degenerate one, because for
two points the 3-D structure is not determined uniquely by any number of views. It is
important to discuss this case, however, as it provides a relevant insight into the theory,
which is discussed further in section 3.2.5. Eqs. (3.18) and (3.19) in this case reduce to:

$$
\left[ b_{01}(k) + z_1^2(k+1) - z_1^2(k) \right] z_1(k+1) = 0.
\tag{3.20}
$$

The only nontrivial solution for this equation occurs when the term inside the brackets is
zero. But using the definition of $b_{01}$, this condition essentially implies that the distance
between the two points is constant. In other words, the distance between the points in
the initial internal model at $k = 0$ will tend to remain the same throughout the entire
motion even if it is wrong. The model will only correct itself if its initial guess for the
distance between the points is small. In that case, the internal model will be forced
to change because the distance between the points in the projected plane will become
larger than the distance in the model. In this case, numerical calculations show that
the model expands, and even has a small overshoot such that the distance between the
points becomes a little larger than the true distance, and then stays at this condition
forever.
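
The two-point behavior described above can be illustrated with a minimal sketch. It assumes point 0 sits at the origin, point 1 rotates about the vertical axis under orthographic projection, and the positive root is taken for the updated depth; the function name, radius, height and frame step are hypothetical values chosen for illustration. The update uses the closed-form minimizer of the two-point measure (set the bracketed term to zero when possible, otherwise take a depth of zero), so it does not attempt to reproduce the small overshoot reported for the numerical calculations.

```python
import math

def two_point_model_distances(initial_z, radius=1.0, y=0.2, step_deg=20.0, frames=36):
    """Model distance between point 0 (origin) and point 1 at each frame."""
    def image_x(k):
        # Orthographic image x of point 1 rotating about the vertical axis.
        return radius * math.sin(math.radians(k * step_deg))

    z = initial_z
    distances = []
    for k in range(frames):
        x_now, x_next = image_x(k), image_x(k + 1)
        distances.append(math.sqrt(x_now ** 2 + y ** 2 + z ** 2))
        # Change in squared image distance between consecutive frames
        # (y is constant for a rotation about the vertical axis).
        b01 = x_next ** 2 - x_now ** 2
        # Depth that zeroes the bracketed term; if none exists, the
        # measure is minimized by a depth of zero.
        z_squared = z ** 2 - b01
        z = math.sqrt(z_squared) if z_squared > 0.0 else 0.0
    return distances

large_guess = two_point_model_distances(initial_z=3.0)
small_guess = two_point_model_distances(initial_z=0.1)
true_distance = math.sqrt(1.0 + 0.2 ** 2)
print(large_guess[0], large_guess[-1])
print(small_guess[0], small_guess[-1], true_distance)
```

With the large initial guess the model distance stays at its (wrong) starting value, while with the small initial guess it grows once the image distance exceeds the model distance.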


3.2.4  Stability  Analysis  for  the  Discrete  Formulation 

The stability analysis for the discrete formulation begins with the same general argument
as for the continuous case. Suppose that at $k = 0$, the model is very close to the
true solution, i.e. $z_i(0) = \bar{z}_i(0) + \epsilon_i(0)$, where $\bar{z}_i(k)$ is the true depth value. We examine
whether or not the perturbation at later times converges to zero, that is, whether
$\lim_{k \to \infty} \epsilon_i(k) = 0$. If this is the case, and the $\epsilon_i(k)$ always remain small, then we can
write an equation similar to Eq. (3.11) for the discrete formulation:

$$
\underline{\epsilon}(k+1) = B(k)\,\underline{\epsilon}(k).
\tag{3.21}
$$

Again, by the same arguments given in the continuous case, we cannot use Eq. (3.21) to
prove the system to be unstable, but only verify whether its convergence is asymptotic.

In this analysis we concentrate on three point configurations, because as in the continuous
formulation, an analytic study with a larger number of points is too cumbersome.
In the results of computer simulations, however, we found that the results derived here
seem to generalize readily to configurations with a higher number of points. We found in
all the cases studied, that if the viewed object is rotating with constant angular velocity
around an axis perpendicular to the viewing axis, then the discrete formulation yields
a converging internal model, but that the rate of convergence depends on the angular
displacements between consecutively viewed frames.

In order to define the concept of rate of convergence, we first point out that recursive
use of Eq. (3.21) shows that $\underline{\epsilon}(k)$ can be readily computed from $\underline{\epsilon}(0)$:

$$
\underline{\epsilon}(k) = \left[ \prod_{j=0}^{k-1} B(j) \right] \underline{\epsilon}(0),
\tag{3.22}
$$

and thus the convergence of $\underline{\epsilon}(k)$ to 0 depends only on this matrix multiplication, rather
than on the perturbations themselves. Furthermore, if $B(j)$ is cyclic (as it is for rotations
in which the ratio between the angular displacement and $2\pi$ is rational), with cycle $m$,
convergence depends only on the multiplications of the matrices over the cycle, which
we denote:

$$
B_m = \prod_{j=0}^{m-1} B(j).
\tag{3.23}
$$

Interestingly, the behavior of $B_m^j$ bears a strong similarity with geometric sequences.
Let us define the spectral radius of $B_m$, $\rho(B_m)$, as the maximal modulus of the characteristic
values of $B_m$. Then it is possible to show (see Appendix D) that:

$$
\rho^j(B_m) = \rho(B_m^j).
\tag{3.24}
$$

From this result, one can understand the following important conclusion (see, for example,
Varga, 1962):

$$
\lim_{j \to \infty} B_m^j = 0 \iff \rho(B_m) < 1.
\tag{3.25}
$$

In other words the necessary and sufficient condition for the internal model to be asymptotically
stable around the true solution, is that the spectral radius of the rotation matrix,
$B_m$, is less than 1.
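
The content of Eqs. (3.24) and (3.25) can be checked numerically on small examples. The sketch below is only an illustration: the two 2 x 2 matrices are hypothetical stand-ins, not the cycle matrices of the scheme, and the helper names are assumptions introduced here. The characteristic values of a 2 x 2 matrix are obtained from its trace and determinant.

```python
import cmath

def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matrix_power(m, j):
    """m raised to the non-negative integer power j by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(j):
        result = matmul(result, m)
    return result

def spectral_radius(m):
    """Largest modulus among the characteristic values of a 2 x 2 matrix."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))

def largest_entry(m):
    return max(abs(value) for row in m for value in row)

stable = [[0.5, 1.0], [0.0, 0.8]]    # hypothetical matrix with spectral radius 0.8
marginal = [[1.0, 0.3], [0.0, 0.5]]  # hypothetical matrix with spectral radius 1

print(spectral_radius(matrix_power(stable, 5)), spectral_radius(stable) ** 5)
print(largest_entry(matrix_power(stable, 200)))
print(largest_entry(matrix_power(marginal, 200)))
```

The powers of the first matrix shrink toward zero, while those of the second do not, in line with the condition on the spectral radius.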

Furthermore, from Eq. (3.24), one sees that the spectral radius corresponds to a
"time constant" by which we can measure the velocity of convergence of the internal
model. In fact, as $\rho$ becomes closer to 1, more revolutions are necessary to obtain
the same amount of convergence of the system, and the spectral radius of $B_m^j$ declines
exponentially with $j$. This exponential decline, however, holds only for the case of small
perturbations, from which Eq. (3.22) was derived, and describes only a general trend, as
in fine detail, the cyclic matrix $B_m$ is composed of a multiplication of partial matrices
that may impose a "noise" in the decline. This noise can be seen in the simulations
shown in section 3.1.

The dependence of the rate of convergence of the internal model on the angular
displacement between viewed frames, $d$, and its consequences, are better illustrated by
some examples. Fig. 8 displays $\rho$ as a function of $d$ for the case in which the viewed
object appears as an equilateral triangle, when viewed along the rotation axis.
Figure 8. Graph of the spectral radius $\rho$ as a function of the angular displacement $d$ for the
case in which the viewed object appears as an equilateral triangle, when viewed along the rotation axis.

The first result of interest in this graph is that $\rho < 1$ for every angular displacement
$d$ such that $0° < d < 90°$. This means that for almost every displacement used by the
algorithm, the true solution (up to a depth reversal) is an asymptotically stable one. This
result holds for every configuration tried, provided that the rotation axis was perpendicular
to the viewing axis. Thus, in spite of the fact that the true solution is asymptotically
stable in the case of the discrete formulation, in the limit of arbitrarily close frames,
that is, in the continuous formulation, the true solution is not asymptotically stable. As
can be seen in the figure, the spectral radius tends to 1 as $d \to 0$. Thus, in the limit,
the condition expressed in Eq. (3.25) for the convergence of $B_m^j$ to 0, is not fulfilled.
This confirms our previous results regarding the lack of stability for the continuous
formulation.

In addition, from the limit of 1 at small displacements, $\rho$ decreases with $d$, showing a
steady improvement in the rate of convergence of the internal model to the true solution
with displacements up to 60°. This observation was also made by Ullman (1981). The
behavior of $\rho$ with small displacements is examined more closely in the next section. In
the rest of this section we discuss the effect of large angular displacements on the spectral
radius.

For large displacements, the rate of convergence deteriorates, as can be seen in Fig.
8, with $\rho$ approaching the value of 1 for $d = 90°$. The spectral radius is 1 for displacements
of 90° because only two different views of the object are available under orthographic
projection, which is too little information for recovering structure from motion based on
the assumption of rigidity. This result appears to be independent of other properties of
the moving points.

This behavior of $\rho$ with angular displacement can be related to the convergence rate
observed in computer simulations. This is illustrated in Fig. 9. We began with the
set of three points, whose true coordinates (when projected onto the X-Z plane) lie
around an equilateral triangle, as shown in Fig. 9a. The y coordinates of the points were
nonzero but small. These points were then rotated around the vertical axis, and the
error measure shown in Eq. (3.1) was computed, as a function of time. Fig. 9b shows
an error plot for the case in which the angular displacement between frames was 20°,
and the points were rotated for a total of 10 revolutions. The corresponding graph, with
error plotted on a log scale, is shown in Fig. 9c. It can be seen that on a log scale, the
long term convergence of the computed 3-D model is approximately linear. A measure
of convergence rate can be given by the slope of the line that best fits this data, in a least
squares sense. In Fig. 9d, the slope of this line is plotted as a function of the angular
displacement between frames (the three point configuration was rotated for a total of 10
revolutions in each case). It can be seen that this graph is qualitatively similar to the
graph of $\rho$ shown in Fig. 8.
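
The slope-based measure of convergence rate can be sketched as follows. This is an illustration under stated assumptions: the error trace is synthetic (a hypothetical geometric decay, not data from the simulations), the logarithm is taken to base 10, and the function name is introduced here for the example.

```python
import math

def log_error_slope(errors):
    """Least-squares slope of log10(error) against the frame index."""
    n = len(errors)
    xs = list(range(n))
    ys = [math.log10(e) for e in errors]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic trace: the error shrinks by a factor of 0.9 at every frame.
errors = [5.0 * 0.9 ** k for k in range(50)]
slope = log_error_slope(errors)
print(slope, math.log10(0.9))
```

For an error that decays geometrically, the fitted slope equals the logarithm of the decay factor, so a steeper (more negative) slope corresponds to faster convergence.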

3.2.5  Convergence  Under  Small  Angular  Displacements  and  Instability  of 
the  Continuous  Formulation 

The question addressed in this section is, why does the rate of convergence fall with
decreasing angular displacement between consecutive frames, eventually leading to instability
for the case of the continuous formulation? One would certainly expect poorer
corrections for the perturbations if smaller displacements are used, because less resolution
is available in the image data. That is, the signal to noise ratio is substantially
reduced. Many more views, however, are seen in the case of small angular displacements,
which might trade off with the deterioration in the data resolution. One must
keep in mind that there are two types of noise that can influence the performance of the
incremental rigidity scheme: (1) the noise from the image data, and (2) the difference
between the model and the true solution. In the computer simulations presented earlier,
the image measurements were computed analytically, and hence had little error, so
that the primary source of noise was the discrepancy between the computed model and
true structure. This source of noise contributes to the deterioration of the solution with
smaller angular displacements between frames. From the results in the previous section,
we can conclude that in some sense, with decreasing displacement, the deterioration due
to poorer corrections occurs at a faster rate than the improvement with the number of
views. We explore this phenomenon in this section.

Formally, we can express the above question as, why does the spectral radius $\rho$
of the rotation matrix $B_m$ increase with decreasing angular displacement $d$, eventually
tending to 1 when $d \to 0$? This is certainly a characteristic of the curve of $\rho$ vs. $d$
if $d$ is small enough (see Fig. 8). The matrix $B_m$ results from the multiplication
of the matrices corresponding to partial displacements (Eq. (3.23)). One would like to
define some quantity related to these partial displacement matrices, which expresses the
deterioration of their information content, and to study the connection of this quantity
to the general spectral radius as a function of $d$. This quantity and its relationship to
the spectral radius were found by numerical experimentation. The intuition underlying
these observations is what follows.

Given three point configurations and fixed angular displacements, an inspection
of the partial displacement matrices shows that they all have exactly the same two
characteristic values, $c_1$ and $c_2$, which are real, and $0 < c_1 < 1$ and $1 < c_2$. If these
characteristic values belonged to a diagonal matrix, then the effect of such a matrix would
be to change the perturbation vector $\underline{\epsilon} = (\epsilon_1, \epsilon_2)^T$ into $(c_1 \epsilon_1, c_2 \epsilon_2)^T$. In general, however,
the vector $\underline{\epsilon}$ can point in any arbitrary direction, and the effects of a diagonal matrix
on the length of $\underline{\epsilon}$ would be different from direction to direction. Thus a meaningful
quantity belonging to these matrices must be some average of the effect of the matrix
over all directions of the perturbation vector. An important example is the mean length
squared of the vector resulting from the application of the matrix to the general unitary
vectors, $(\cos\gamma, \sin\gamma)$. This quantity is the following:

$$
\frac{1}{2\pi} \int_0^{2\pi} \left( c_1^2 \cos^2(\gamma) + c_2^2 \sin^2(\gamma) \right) d\gamma = \frac{c_1^2 + c_2^2}{2}.
\tag{3.26}
$$
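
The directional average above can be verified numerically. The sketch below applies a diagonal matrix with entries $c_1$ and $c_2$ to unit vectors sampled uniformly in direction and averages the squared length of the result; the particular values of $c_1$ and $c_2$ and the function name are hypothetical choices for illustration.

```python
import math

def mean_squared_length(c1, c2, samples=3600):
    """Average squared length of diag(c1, c2) applied to unit vectors."""
    total = 0.0
    for n in range(samples):
        gamma = 2.0 * math.pi * n / samples
        vx = c1 * math.cos(gamma)
        vy = c2 * math.sin(gamma)
        total += vx * vx + vy * vy
    return total / samples

c1, c2 = 0.8, 1.1  # hypothetical characteristic values with 0 < c1 < 1 < c2
average = mean_squared_length(c1, c2)
print(average, (c1 ** 2 + c2 ** 2) / 2.0)
```

In this example the average is below 1 even though $c_2 > 1$, which is the sense in which the mean effect can shrink the perturbation.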


The partial displacement matrices are in general not diagonal, but if $d$ is small
enough they can be closely approximated as orthogonally similar to the diagonal matrix
having $c_1$ and $c_2$ as its elements. If we begin with an arbitrary perturbation vector,
these matrices will rotate the vectors and transform the resultants as described above,
and then repeat this transformation until a cycle is completed. Thus for small enough
displacements, one can expect the diagonal matrices to work on perturbation vectors of
almost all directions, thus having an average effect as described in the last paragraph.
Generally this mean effect always represents a decrease in the perturbation vector, which
is applied $2\pi/d$ times during a cycle of revolution of the object. Thus for small enough
displacements one can expect a connection between $c_1$, $c_2$, $\rho$ and $d$ given by an equation
of the following form:


$$
\rho(d) = \left( \frac{c_1^2(d) + c_2^2(d)}{2} \right)^{2\pi/d}.
\tag{3.27}
$$


Eq. (3.27) closely matches the behavior of the spectral radius, for the equilateral
triangle case (Fig. 8) for $d < 20°$. For other configurations, relationships very similar to
this one were observed to apply. The importance of Eq. (3.27) is that it relates a quantity
that can be computed in any partial displacement matrix, namely $(c_1^2(d) + c_2^2(d))/2$ (the
deterioration factor), to the spectral radius of the full rotation matrix, and it enables us
to reduce the problem of why the continuous formulation is unstable to the question of
how fast does the deterioration factor tend to 1 as $d \to 0$. In fact one can see from Eq.
(3.27) that a necessary condition for $\lim_{d \to 0} \rho(d) = 1$ is that:
$$
\lim_{d \to 0} \frac{c_1^2(d) + c_2^2(d)}{2} = 1.
\tag{3.28}
$$

This condition, however, is not sufficient. For example, if the deterioration factor
approaches 1 linearly with $d$, i.e. $(c_1^2(d) + c_2^2(d))/2 \approx 1 - ad$ for small $d$, then because of
the power $2\pi/d$ in Eq. (3.27) we would have $\lim_{d \to 0} \rho(d) = e^{-2\pi a} < 1$. We expect then,
that the deterioration factor approaches 1 faster than linear. This is indeed the case, as
can be seen by expanding $(c_1^2(d) + c_2^2(d))/2$ into the first few powers of its Taylor series.
For any three point configuration, this expansion yields:
(3.29)


This quadratic approach of the deterioration factor to 1 is fast enough to ensure that
$\lim_{d \to 0} \rho(d) = 1$, as can be seen by substituting Eq. (3.29) into Eq. (3.27) and taking the limit.
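
The difference between a linear and a quadratic approach to 1 can be seen numerically. The sketch below assumes only that the spectral radius is the deterioration factor raised to the power $2\pi/d$, as the text describes for Eq. (3.27); the coefficient `a` is a hypothetical value, $d$ is measured in radians, and the function name is introduced here for illustration.

```python
import math

def cycle_factor(deterioration, d):
    """Deterioration factor applied 2*pi/d times (one full revolution)."""
    return deterioration ** (2.0 * math.pi / d)

a = 0.05  # hypothetical coefficient of the leading correction term
for d in (0.5, 0.1, 0.01, 0.001):
    linear = cycle_factor(1.0 - a * d, d)         # factor approaches 1 linearly in d
    quadratic = cycle_factor(1.0 - a * d * d, d)  # factor approaches 1 quadratically in d
    print(d, linear, quadratic)

print(math.exp(-2.0 * math.pi * a))  # limit of the linear case
```

In the linear case the value settles at a constant below 1, whereas in the quadratic case it tends to 1 as the displacement shrinks.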

Therefore, we confirm that the instability of the continuous formulation is due to
the fact that the deterioration in the data resolving power as a result of the small angular
displacements between frames is faster (quadratic with the inverse of the displacement)
than the increase of information due to the increase in the number of frames (linear with
the inverse of the displacement). This analysis provides theoretical support for the observation
that for the case of the discrete formulation of the incremental rigidity scheme,
the rate of convergence of the internal model to the true solution diminishes as the angular
displacement between frames decreases. In the limit of very small displacements,
that is, in the case of the continuous formulation, this deterioration leads to a lack of
asymptotic convergence of the model to the true structure.


4.  Perspective  Formulations  of  the  Incremental  Rigidity  Scheme 

In this section, we first present discrete and continuous formulations of the incremental
rigidity scheme that use perspective projection of the scene onto the image plane. We
then present a few examples of the results of computer simulations, in which the perspective
formulations were applied to sequences of points undergoing both pure rotation
about a single axis in space and pure translation through space. The behavior of the
perspective formulation of the incremental rigidity scheme is more complex than that
of the orthographic formulation. We found that if the absolute position of an object
in space is known throughout the motion of the object, then the perspective formulation
performs well, similar to the orthographic formulation. If the absolute position of
the object is unknown, the incremental rigidity scheme in general does not derive the
correct 3-D structure. The results of computer simulations revealed a degradation in
performance with smaller angular and spatial displacements between frames, but this
degradation was somewhat more severe than in the orthographic formulation. In the
limit of the continuous formulation, the solution is no longer stable.

In both the discrete and continuous formulations, we assume that the positive z axis
points in the direction of the optical axis, with the image plane at $z = 1$, as shown in Fig.
10. For notational convenience, we let $(x_i(t), y_i(t), z_i(t))$ represent the true coordinates


4.1  The  Discrete  Formulation 


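In the discrete case, we compute the depth values that minimize the measure of
rigidity, $D_a(t, t')$, given by the following (for notational simplicity, $(x_i, y_i, z_i)$ refer to the
coordinates of the points at time $t$ in the computed model, while $(x_i', y_i', z_i')$ refer to the
coordinates of the points at a later time $t'$):

$$
\begin{aligned}
D_a(t, t') &= \sum_{i,j} \left( l_{ij}(t) - l_{ij}(t') \right)^2 \\
&= \sum_{i,j} \Big[ \left( (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 \right)^{1/2} \\
&\qquad\quad - \left( (x_i' - x_j')^2 + (y_i' - y_j')^2 + (z_i' - z_j')^2 \right)^{1/2} \Big]^2.
\end{aligned}
\tag{4.2}
$$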


This measure is the same as that used in the orthographic formulation (i.e., it is derived
from Eq. (2.1) with the $l_{ij}^3(t)$ factor in the denominator omitted, as shown in Eq. (A1)
of Appendix A).

As before, we assume that we have a current model $(x_i(t), y_i(t), z_i(t))$. Rather than
assuming that $x_i(t')$ and $y_i(t')$ are known, however, we assume that only $u_i(t')$ and $v_i(t')$
are known (they are derived from the true space coordinates at time $t'$). The coordinates
of the points in space at time $t'$ for the computed model are then given implicitly by:

$$
x_i(t') = u_i(t')\, z_i(t'), \qquad y_i(t') = v_i(t')\, z_i(t').
\tag{4.3}
$$

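By substituting Eqs. (4.3) into Eq. (4.2), $D_a(t, t')$ can be expressed in terms of the
coordinates of the points as follows ($(u_i', v_i')$ denote the image coordinates at time $t'$):

$$
\begin{aligned}
D_a(t, t') = \sum_{i,j} \Big[ & \left( (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 \right)^{1/2} \\
& - \left( (u_i' z_i' - u_j' z_j')^2 + (v_i' z_i' - v_j' z_j')^2 + (z_i' - z_j')^2 \right)^{1/2} \Big]^2.
\end{aligned}
\tag{4.4}
$$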


After the values $z_i(t')$ that minimize $D_a(t, t')$ are computed, Eqs. (4.3) are used to derive
the space coordinates $x_i(t')$ and $y_i(t')$. The new 3-D coordinates, $(x_i(t'), y_i(t'), z_i(t'))$,
then become the current 3-D model, a new frame in the sequence is registered, and the
process repeats itself.
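
The evaluation of the rigidity measure for a candidate set of depths can be sketched as follows. This is an illustration under stated assumptions, not the implementation of Appendix A: it assumes the measure sums, over unordered point pairs, the squared change in inter-point distance between the current model and the back-projected candidate; the point coordinates and function names are hypothetical; and the minimization over the depths is left out.

```python
import math

def back_project(uv, depths):
    """Eqs. (4.3): space coordinates from image coordinates and depths."""
    return [(u * z, v * z, z) for (u, v), z in zip(uv, depths)]

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def rigidity_measure(model, uv_next, z_next):
    """Sum over point pairs of the squared change in inter-point distance."""
    candidate = back_project(uv_next, z_next)
    total = 0.0
    for i in range(len(model)):
        for j in range(i + 1, len(model)):
            change = distance(model[i], model[j]) - distance(candidate[i], candidate[j])
            total += change ** 2
    return total

# Hypothetical rigid object translated by 30 units parallel to the image plane.
model = [(0.0, 0.0, 80.0), (10.0, 5.0, 90.0), (-8.0, 12.0, 70.0)]
moved = [(x + 30.0, y, z) for x, y, z in model]
uv_next = [(x / z, y / z) for x, y, z in moved]  # image plane at z = 1

true_depths = [z for _, _, z in moved]
flat_depths = [80.0, 80.0, 80.0]
print(rigidity_measure(model, uv_next, true_depths))
print(rigidity_measure(model, uv_next, flat_depths))
```

When the model is already correct, the true depths give a measure of (numerically) zero, while a wrong set of depths gives a positive value; the scheme searches for the depths that make this value smallest.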

In the case of perspective projection, 3-D structure can be recovered from relative
motion only up to a multiplicative scale factor. This is clear by inspection of Eqs. (4.1);
a constant scaling of the space coordinates $(x_i(t), y_i(t), z_i(t))$ does not change the image
coordinates $(u_i(t), v_i(t))$. In most of the computer simulations, we assumed that the
overall scale factor is known; this is equivalent to assuming that the absolute position of
the object in space is known. In the simulations, the absolute coordinates of one of the
$n$ points in motion were supplied to the algorithm and the coordinates of the remaining
$n - 1$ points were computed relative to the known point (similar to the simulations of
the orthographic formulation). If the absolute position of the object in space is not
known, then the scale of the computed 3-D model depends on the choice of the initial
configuration. For example, suppose that we assume that the set of moving points
initially lies on a plane that is parallel to the image plane. The scale of the computed
3-D model will then depend on where this initial plane is placed in depth. That is, if
the plane is positioned twice as far from the image plane, the computed 3-D model will
essentially be twice as large. We will briefly mention the results of computer simulations
in which the absolute position of the object in space is unknown. In those simulations,
when we began with a flat initial configuration, we usually placed the initial z coordinates
of the points at a depth that was approximately the mean of the true z coordinates of
the points in space. Further details of the implementation of the discrete formulation
can be found in Appendix A.
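
The scale ambiguity can be checked directly. The sketch below assumes the projection $u = x/z$, $v = y/z$ implied by Eqs. (4.3) with the image plane at $z = 1$; the point coordinates and the function name are hypothetical.

```python
def project(point):
    """Perspective projection onto the image plane z = 1."""
    x, y, z = point
    return (x / z, y / z)

points = [(10.0, 5.0, 90.0), (-8.0, 12.0, 70.0), (3.0, -4.0, 85.0)]
scaled = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in points]

images = [project(p) for p in points]
scaled_images = [project(p) for p in scaled]
print(images)
print(scaled_images)
```

A structure twice as large and twice as far away produces the same image coordinates, so the image alone cannot fix the overall scale.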


4.2  The  Continuous  Formulation


In the continuous case, we compute the z components of velocity that minimize the
measure of rigidity, $D_r(t)$, given by the following (the coordinates and velocities are all
measured at time $t$):

$$
D_r(t) = \sum_{i,j} \frac{\left[ (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right]^2}{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}.
\tag{4.5}
$$


This measure is the same as that used in the orthographic case (see Eq. (2.7)). It is
assumed that a current model of the coordinates of the points in space is given, along with
the known coordinates and velocities of the points in the image plane, $u_i(t)$, $v_i(t)$, $\dot{u}_i(t)$, $\dot{v}_i(t)$.
The velocities of the points in space for the computed model are then given implicitly
by:

$$
\begin{aligned}
\dot{x}_i(t) &= \dot{u}_i(t)\, z_i(t) + u_i(t)\, \dot{z}_i(t), \\
\dot{y}_i(t) &= \dot{v}_i(t)\, z_i(t) + v_i(t)\, \dot{z}_i(t).
\end{aligned}
\tag{4.6}
$$

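If we let $x_{ij} = x_i - x_j$, $y_{ij} = y_i - y_j$, $z_{ij} = z_i - z_j$, and $\dot{z}_{ij} = \dot{z}_i - \dot{z}_j$, then $D_r(t)$ is
expressed in terms of the coordinates of the points as follows:

$$
D_r(t) = \sum_{i,j} \frac{\left[ x_{ij}(\dot{u}_i z_i + u_i \dot{z}_i - \dot{u}_j z_j - u_j \dot{z}_j) + y_{ij}(\dot{v}_i z_i + v_i \dot{z}_i - \dot{v}_j z_j - v_j \dot{z}_j) + z_{ij} \dot{z}_{ij} \right]^2}{x_{ij}^2 + y_{ij}^2 + z_{ij}^2}.
\tag{4.7}
$$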

In other respects, the continuous formulation is similar to the discrete formulation. A
model of the structure of the moving object is built up by continually taking into account
new velocity information over an extended time period. After the $\dot{z}_i(t)$ are computed
by minimizing Eq. (4.7), Eqs. (4.6) are used to derive $\dot{x}_i(t)$ and $\dot{y}_i(t)$, which are then
used to compute the new 3-D model. Again, because perspective projection is used, the
structure of the object can only be computed up to a multiplicative scale factor. Further
details of the implementation of the continuous formulation are presented in Appendix
A.
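
The relation between image velocities and space velocities used in Eqs. (4.6) follows from differentiating $x = uz$ and $y = vz$ with respect to time. The sketch below checks it on a hypothetical smooth trajectory, with the image velocities estimated by central differences; the trajectory, step size and function names are assumptions made for the example.

```python
import math

def space_velocity(u, v, z, du, dv, dz):
    """Time derivative of x = u z and y = v z by the product rule."""
    return (du * z + u * dz, dv * z + v * dz)

def position(t):
    # Hypothetical trajectory of a single point in space.
    return (10.0 * math.cos(t), 5.0 + t, 80.0 + 10.0 * math.sin(t))

def image(t):
    x, y, z = position(t)
    return (x / z, y / z)  # image plane at z = 1

t, h = 0.3, 1e-5
u, v = image(t)
du = (image(t + h)[0] - image(t - h)[0]) / (2.0 * h)
dv = (image(t + h)[1] - image(t - h)[1]) / (2.0 * h)
z = position(t)[2]
dz = 10.0 * math.cos(t)  # analytic derivative of the depth

x_dot, y_dot = space_velocity(u, v, z, du, dv, dz)
print(x_dot, -10.0 * math.sin(t))
print(y_dot, 1.0)
```

The recovered space velocities agree with the analytic derivatives of the trajectory, up to the error of the finite-difference estimate.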


4.3  Computer  Simulations 

In this section, we present some results of the computer simulations of the discrete
and continuous perspective formulations of the incremental rigidity scheme, for a small
number of examples. We first consider the case in which a set of discrete points is
translated through space, and then consider the rotation of a set of points about a
vertical axis. Through simulations of the discrete formulation, we examine the rate of
convergence of the algorithm and the quality of the solution that it yields, as a function
of the size of the angular and spatial displacements between frames. We then observe
the limiting behavior of the continuous formulation for the case of rotation. We also note
the difference in performance of the perspective formulations when the absolute position
of the moving points in space is either known or unknown.


It should be stressed that in the case of orthographic projection, it is not possible
to recover 3-D structure for objects undergoing pure translation, because the relative
positions of projected points in the image do not change as objects translate. The use of
perspective projection has the advantage of allowing the recovery of structure for translating
objects. In the simulations presented here, objects were oscillated back and forth
in the direction parallel to the image plane. If the objects were allowed to translate
in one direction for a large extent, the projected image of the points would eventually
become so small that the recovery of structure would become difficult due to a loss of
numerical accuracy in the measurement of the changing positions of the points. Oscillating
the points allows us to analyze their structure over an extended time period while
maintaining a relatively large image.

In the first set of examples, the input consisted of a set of six points in space. The
coordinates of the points were chosen randomly and the coordinates of one of the points
were known to the algorithm. The projections of the points onto the X-Y and X-Z
planes are shown in Figs. 11a and 11b, respectively. This set of points was oscillated back
and forth in a direction parallel to the image plane. The X-Z projection (bird's eye
view) of the rightmost and leftmost positions of the points in space are shown in Figs. 11c
and 11d, respectively. The x coordinates of the points in Figs. 11c and 11d are shifted
by 60 units from the initial positions shown in Fig. 11b. In the simulations, the initial
points shown in Fig. 11b were first translated to the right in constant discrete steps and
then translated to the left, with the same discrete steps. A 3-D model was built up over
several oscillations of the points. The initial model for the structure of the points was
flat, with the z coordinates placed near the mean of the true z coordinates of the points.
For the particular set of points used in these examples, $z_i(t) = 80$, $i = 1, ..., n$, in the
initial 3-D model. Once the initial $z_i(t)$ are specified, the initial $x_i(t)$ and $y_i(t)$ can be
determined from the image coordinates $u_i(t)$ and $v_i(t)$. The one point whose position is
known throughout the oscillation of the points is indicated by an arrow in Fig. 12a.

Fig. 12 shows a bird's eye view of the computed 3-D model after 1 (left column),
4 (middle column) and 10 (right column) complete oscillations of the points, using three
different sizes of spatial displacements between frames. The true 3-D structure of the
points is shown again in Fig. 12a. In the case of Fig. 12b, the size of the spatial
displacements between frames was 60, so that each complete oscillation consisted of four
frames (i.e., the points were displaced to the right in one step, returned to the central
position, displaced to the left in one step and then returned again to the central position).
In the case of Fig. 12c, the points were translated in steps of 30, so that a complete
oscillation of the points consisted of 8 frames. Finally, in the case of Fig. 12d, the points
were translated in steps of 5, so that each full oscillation consisted of 48 frames. In Fig.
12e, we show plots of the error in the computed models as a function of time, for 10
complete oscillations of the points. The error measure used here is the same as that used
in the orthographic case, shown in Eq. (3.2). The error plots for spatial displacements
of 60, 30 and 5 are shown superimposed, with the displacements indicated above each
curve. From the results shown in Fig. 12, it can first be seen that the algorithm does
essentially converge to the true structure of the points. Second, the rate of convergence
of the computed model to the true solution decreases with smaller spatial displacements.
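
The frame counts quoted above follow from the oscillation pattern: with a maximum shift of 60 units on either side, one oscillation takes four times (60 / step) frames. The sketch below generates the x offsets for one oscillation; the function name is hypothetical and the step is assumed to divide the amplitude evenly.

```python
def oscillation_offsets(amplitude, step):
    """x offsets for one oscillation: right, back to center, left, back to center."""
    n = amplitude // step
    right = [step * k for k in range(1, n + 1)]
    back = [amplitude - step * k for k in range(1, n + 1)]
    left = [-step * k for k in range(1, n + 1)]
    back_to_center = [-amplitude + step * k for k in range(1, n + 1)]
    return right + back + left + back_to_center

for step in (60, 30, 5):
    offsets = oscillation_offsets(60, step)
    print(step, len(offsets))

print(oscillation_offsets(60, 60))
```

Steps of 60, 30 and 5 give 4, 8 and 48 frames per oscillation, matching the counts stated in the text.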


Fig.  13  illustrates  the  behavior  of  the  perspective  formulation,  for  the  case  of  rotation 
of  the  points  about  a  central  vertical  axis.  The  set  of  six  points  shown  in  Fig.  11  was 
rotated  about  a  seventh  point  that  was  placed  at  the  position: 

»<(*).  *i(*))  =  (M,  80). 

The  axis  of  rotation  passed  through  this  point  and  was  perpendicular  to  the  X  Z  plane. 
The  position  of  this  central  point  was  known  to  the  algorithm  throughout  the  motion  of 
the  points  and  the  positions  of  the  remaining  six  points  were  computed  relative  to  this 
central point. As before, the initial configuration was flat, with $z_i(t) = 80$ for every point $i$.

The set of 7 points was rotated in angular steps of 40° and 5°. Figs. 13b and 13c
show the computed 3-D model after total rotations of 40°, 80°, 120° and 360° (indicated
on the left), for angular displacements between frames of 40° and 5°, respectively. The
true positions of the points are shown in Fig. 13a. In Fig. 13d, the graphs of the error
in the computed models are shown as a function of total angular rotation, for three
full revolutions of the points. It can be seen that the rate of convergence and the quality
of the 3-D structure decrease with the smaller angular displacement between frames.
Although similar to the case of orthographic projection, the degradation in performance
with smaller angles was somewhat more severe in the case of perspective projection.

Fig. 14 illustrates the behavior of the continuous formulation of the incremental
rigidity scheme, for the case of perspective projection. The same set of points shown
in Fig. 11 was rotated about the vertical axis and the 3-D model was computed at
infinitesimally closely spaced frames, using the instantaneous velocities projected onto

Figure 12. (a) Projection onto the X-Z plane of the true 3-D structure of the set of points.
(b) The computed 3-D model after 1 (left column), 4 (middle column) and 10 (right column)
complete oscillations of the points, for a spatial displacement between frames of size 60. (c)
and (d) The same three computed 3-D models, for spatial displacements of size 30 and 5,
respectively.
the image. As in the previous rotation examples, a seventh point was placed at the center
of rotation of the points, and the position of this central point was known to the algorithm
throughout the rotation of the points. In Fig. 14a, we compare the true positions of
the points (left) with the best solution (right) obtained over four full revolutions of the
points. Although the model is quite close to the true structure at this position of the
points, the solution oscillates significantly over an extended time period, similar to the
orthographic case. A graph of the error in the computed model over the four revolutions
is shown in Fig. 14b. The arrow marks the point at which the solution shown in Fig. 14a
was obtained. There is an initial fast convergence toward the true structure of the points,
but the algorithm then oscillates with high amplitude away from the true solution. We
conclude that, similar to the orthographic case, the direct use of velocity information for
the recovery of structure by the incremental rigidity scheme does not lead to a robust
solution.

We should finally note that we also explored some examples in which no information
about the absolute position of the points in space is known. In this case, we began with
an initial configuration that was flat, and otherwise provided no further constraint on
the positions of the points. The points were either oscillated back and forth in the
direction parallel to the image plane, or rotated about a vertical axis. We found that
the incremental rigidity scheme would sometimes derive the correct solution under these
conditions, but in general, the algorithm did not derive the correct structure, although the
computed solution was essentially rigid. The computed solution was not just a scaled
or rotated version of the true structure, but actually a different one altogether. This
suggests that some additional constraint may be required to derive the correct solution.

To summarize, the perspective formulation of the incremental rigidity scheme appears
to perform well when the absolute position of the object in space is known. This
formulation then allows the recovery of structure for objects undergoing pure translation
through space, as well as rotation. The results of computer simulations revealed a
degradation in performance with smaller angular and spatial displacements between frames.
This degradation appeared to be more severe than in the case of orthographic projection,
when the points were rotated about a vertical axis. When the points were translated,
there was a more gradual decrease in the rate of convergence and quality of the computed
3-D model with smaller spatial displacements. In the limit of the continuous formulation,
the solution was no longer stable.




5.  Summary  and  Conclusions 

In this paper, we studied and generalized the incremental rigidity scheme for the recovery
of structure from motion. This algorithm, as first proposed by Ullman (1984), assumed
the visual input to consist of a sequence of frames, each containing a finite number of
points. The scheme maintained an internal model of the structure of the viewed object,
which was updated from frame to frame, so as to be consistent with the changing image
and to be as rigid as possible. This internal model was shown to correctly converge to
the true structure of the object, for both rigid and nonrigid motions.


Figure 13. (a) The true positions of the points after rotations by 40°, 80°, 120° and 360°.
(b) The computed 3-D model after rotation of the points by 40°, 80°, 120° and 360°, for an
angular displacement between frames of 40°. (c) The same computed 3-D models, for an angular
displacement of size 5°. (d) Graphs of the error in the distances between points in the
computed 3-D model, as a function of time, for three full revolutions of the points.


In the present paper, we focused on an observation made by Ullman, that the rate
of convergence of the internal model to the true structure increased with the amount
of change between consecutive frames. A major part of our analysis focused on objects
that rotated with constant angular velocity around an axis perpendicular to the viewing
axis, and whose image was formed through an orthographic projection, at equally spaced
angular displacements. The complete dependence of the convergence rate on the size of
the angular change was obtained for a large variety of objects.

Ullman's observation was confirmed for small displacements. Indeed, the convergence
rate first increased towards a maximum and then decreased, as a function of the
angular displacements between the frames. The deterioration of the algorithm for high
displacements can be understood if we note that the number of frames available for analysis
in a given total number of revolutions of the object, or in other words, the amount
of information provided to the scheme, is reduced in inverse proportion to the angular
changes between the frames.

The deterioration of the algorithm for small displacements, however, has a more
complex and surprising basis. In this case, although the number of frames used in a
given amount of rotation is high, the spatial resolution between the points is low (see
the discussion at the beginning of section 3.2.5). Our analysis indicates that the latter effect
dominates for very small angular jumps between frames, reducing the convergence rate.
Furthermore, we found that this reduction is such that in the limit of infinitesimally
small displacements, where only velocity information is used, the internal model does
not even have a full convergence to the true structure of the viewed object, but only a
rough approach to the correct solution.

In  analogy  to  linear  algebra,  one  can  say  that  the  transformations  from  frame  to 
frame  of  the  internal  model  in  the  incremental  rigidity  scheme  become  ill  conditioned  as 
Figure 14. (a) The best solution (right) obtained over four revolutions of the points is compared
with the true positions (left), for the case of the continuous formulation. (b) The error in the
internal 3-D distances between points in the model, as a function of time. The arrow in (b)
indicates the instant at which the solution in (a) was obtained.


the angular displacements decrease, i.e., they become sensitive to noise. This property
is not unique to the present transformations, but occurs in other important situations
such as numerical differentiation. The noise sensitivity of a differentiation of order $m$
increases as $n^m$ when the number of points $n$ in a given interval is increased (Strang,
1970).
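The growth of noise sensitivity in numerical differentiation can be illustrated with a small experiment. The sketch below is not from the paper: the function name, the Gaussian noise model and the unit interval are assumptions chosen for illustration. It applies repeated forward differences to pure noise sampled at $n$ points and reports the root-mean-square size of the result, which should grow roughly as $n^m$ for a difference of order $m$.

```python
import random


def finite_difference_noise(n, order, noise=1e-3, seed=0):
    """RMS size of the order-th forward difference of pure noise on [0, 1].

    n     -- number of sampling intervals (hypothetical parameter name)
    order -- order m of the numerical derivative
    noise -- standard deviation of the additive Gaussian noise (assumed model)
    """
    rng = random.Random(seed)
    h = 1.0 / n  # spacing shrinks as more points are packed into the interval
    samples = [rng.gauss(0.0, noise) for _ in range(n + 1)]
    for _ in range(order):
        # one forward difference divides by h, i.e. multiplies the noise by about n
        samples = [(b - a) / h for a, b in zip(samples, samples[1:])]
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


for n in (200, 2000):
    print(n, finite_difference_noise(n, 1), finite_difference_noise(n, 2))
```

Increasing $n$ by a factor of 10 should raise the first-difference noise by about 10 and the second-difference noise by about 100, up to statistical fluctuation.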

As mentioned before, the orthographic projection is in general not physically valid.
It was used in the present paper because it allows a deeper analysis of the phenomena
studied. This analysis is further validated because these results are not limited to the
case of orthographic projection, or to the case in which the object rotates. Under certain
conditions, the incremental rigidity scheme can be generalized successfully to formulations
that use perspective projection, and in this case the structure of the viewed
object is recoverable for both rotation and translation. Also, similar to the orthographic
case, the convergence rate of the internal model to the true structure decreased if smaller
spatial changes between consecutive frames were used.


Inspection of previous results by other researchers suggests that the use of information
that is local in space and time is insufficient to allow a robust recovery of structure
from motion. We distinguish between two types of local information that are relevant to
the problem: (1) spatial locality, which refers to the use of a small number of points or
features of the viewed object, and (2) temporal locality, which is the use of a small number
of views of the object. Early algorithms for recovering structure from motion based
this recovery on a limited number of points and views, and were able to recover the structure
of the viewed object when it was rigid. When an object deformed, some algorithms
could determine its nonrigidity through inconsistencies of the underlying equations. It
followed that these schemes are not robust against noise, as noise has effects similar to
those of nonrigidity.

It can be shown that motion information that is extended in space, but temporally
local, can be used to provide stable recovery of structure from motion (Bruss and Horn,
1983; Negahdaripour and Horn, 1985). Perceptual evidence, however, suggests that
spatial extension is not necessary for the solution of the structure from motion problem
by the human visual system (Borjesson and von Hofsten, 1973; Johansson, 1975; Petersik,
1980).

In contrast, Ullman's incremental rigidity scheme, which was applied locally in
space, both in its original formulation and in the present paper, uses an internal model
that is updated constantly, in order to extend motion information over time and converge
to the correct solution. Some perceptual studies suggest that this temporal extension
may be used by the human visual system (Wallach and O'Connell, 1953; White and
Mueller, 1960; Green, 1961).

We showed in the present paper, however, that an algorithm that overcomes temporal
locality does not necessarily provide a robust solution to the structure from motion
problem. It seems that in this case, a necessary condition for a stable solution is the
use of views of the object that are significantly disparate. In the limit, the use of pure
velocity information allows only a rough solution to the structure from motion problem,
which is less stable over an extended time period. A somewhat similar conclusion was
derived by Ullman (1983), who argued that in a temporally local scheme, the recovery
of structure from the instantaneous velocity field is impossible under orthographic projection,
and that for perspective projection, the recovery is unstable. We suggest that
the problems with using local velocity information are too large to be overcome by an
extension of this information over time. In other words, the inaccuracies introduced by
a velocity based scheme cannot be corrected by the use of multiple views of the object.

Our results therefore hint that even in a temporally extended algorithm, a memory
of sufficiently distinct past views or internal models of the viewed object is necessary for
a robust recovery of its structure. We point out that the incremental rigidity scheme, in
its original formulation and in the present paper, used memory of only one past internal
model. Although in principle a memory of many past models could be used by the
algorithm, it is remarkable that such a minimal amount of past information is sufficient
to provide a robust recovery of structure. Still, it may be the case that the use of multiple
past views or internal models for establishing a new model of 3-D structure could increase
the rate of convergence of the internal model.

In further support of the above conclusions, we note that in order to recover 3-D
structure from measured image velocities, it is essential that those velocities at least
approximate the true projected velocities of moving elements in space. In general, however,
it is difficult to derive real projected velocities from the optical flow pattern on the eye
or camera. Factors such as changing illumination, specularities, shadows, and rotation
of an object surface relative to a light source can create patterns of optical flow that do
not correspond to the real movement of features on a physical surface. The difficulty of
computing correct projected velocities provides an additional motivation for not basing
the recovery of structure from motion directly on velocity information.

Our analysis raises a number of issues regarding the recovery of structure from
motion in the human visual system. First, it is not clear whether the visual system
achieves a stable solution to the structure from motion problem, or a rough solution
such as that provided by a velocity based scheme. The more robust solution may not be
essential if we consider that other perceptual cues, such as binocular stereo, may help
to improve the quality of the 3-D solution obtainable by the structure from motion
process. Further psychophysical experiments are required to examine this question.

A second point of interest is that our analysis suggests that if the visual system
incorporates a robust solution to the structure from motion problem, it must be able to
match corresponding points in very disparate views of moving objects. The displacements
between corresponding points may be larger than the spatial limits proposed for the
short range perceptual process found for 2-D motion (Braddick, 1973, 1974, 1980; Pantle
and Picciano, 1976; Petersik, Hicks and Pantle, 1978; Petersik and Pantle, 1979). A
similar conclusion, based on different grounds, was formulated by Petersik (1980). He
explored the sensations elicited by stroboscopic simulations of a rotating transparent
sphere filled with randomly positioned dots. By manipulating both spatial and temporal
variables in the simulation, he concluded that corresponding elements in consecutive
frames can be matched over spatial and temporal distances that exceed the empirically
determined limits of the 2-D short range process. Even in this experiment, however,
points that reach the periphery of the sphere have small displacements from frame to
frame that may fall spatially inside the short range process. It remains, therefore, to be
established that these points do not provide all the information used by the subjects to
sense the continuous rotation and internal volume of the sphere.

If the human visual system is able to match very distant corresponding points from
disparate views of a moving object, the question of how we solve this correspondence
problem becomes important (for a review of the motion correspondence problem, see
Ullman (1981)). This correspondence could be established either through the repeated
use of a short range matching process, or through the use of an explicit, long range
matching process. In the first case, the short range process could provide an essentially
continuous tracking of features in the changing image, and the positions of features could
then be sampled by a longer range tracking process that feeds disparate positions into
the structure from motion process (a similar idea has been suggested by Ullman (1981)).
In the latter case, a correspondence would be established directly from disparate views
of moving objects, without the aid of the intermediate short range process. Petersik's
(1980) results may suggest that it is possible to solve this long range correspondence
problem for recovering structure. This correspondence problem could be solved in conjunction
with the structure from motion process; that is, a correspondence could be
chosen that subsequently gives rise to the most rigid 3-D interpretation (such a matching
process requires a global decision procedure, and may be well suited to solution by
parallel schemes such as Hopfield's "neuronal nets" (1984)). Given the difficulty of solving
this long range correspondence problem in general, however, it seems unlikely that
short range measurements would not be used when they are available.
Appendices


Appendix  A 

In this appendix, we present some of the details of the implementation of the incremental
rigidity scheme used to derive the simulation results presented in sections 3 and 4.




Orthographic  Projection:  The  Discrete  Formulation 


Ullman formulated the incremental rigidity scheme as the computation of depth values
$z_i(t')$ that minimize the measure of rigidity given by $D(t, t')$, as shown in Eq. (2.2).
The analyses presented in this paper mainly consider rigid objects in motion, which are
compact, in the sense that the internal distances between pairs of points do not differ
much from one another. In this case, the additional distance factor in the denominator
of the measure used by Ullman has little influence on the resulting solution obtained
by the algorithm. We therefore removed this factor in most of our simulations, and
minimized instead the following expression (the distances $l_{ij}(t)$ and $l_{ij}(t')$ are expressed
in terms of the coordinates of the points, and $(x_i, y_i, z_i)$ refer to the model coordinates
at time $t$, while $(x'_i, y'_i, z'_i)$ refer to the model coordinates at time $t'$):

$$D_d(t, t') = \sum_{i,j} \left[ \left( (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 \right)^{1/2} - \left( (x'_i - x'_j)^2 + (y'_i - y'_j)^2 + (z'_i - z'_j)^2 \right)^{1/2} \right]^2 \tag{A1}$$


The $z_i(t')$ that minimize $D_d(t, t')$ satisfy a system of nonlinear equations given by:

$$\frac{\partial D_d}{\partial z_i(t')} = 0, \qquad i = 1, \ldots, n - 1. \tag{A2}$$


Rather than solving this system of equations explicitly, we solved them implicitly through
the use of a steepest descent minimization algorithm for Eq. (A1), based on the gradient
of $D_d(t, t')$ (Luenberger, 1973). The components of the gradient are given by the
following:

$$\frac{\partial D_d}{\partial z_i(t')} = -2 \sum_{j} \frac{l_{ij}(t) - l_{ij}(t')}{l_{ij}(t')} \left( z'_i - z'_j \right). \tag{A3}$$


In the case of orthographic projection, only relative depths can be recovered. In the
implementation, we placed the initial point $(x_0(t), y_0(t), z_0(t))$ at the origin of the
coordinate system; that is, $(x_0(t), y_0(t), z_0(t)) = (0, 0, 0)$. At every instant, the remaining
$n - 1$ points were then given relative to this initial point. Unless otherwise stated, the
initial configuration of points was flat, that is, $z_i(0) = 0$, for $i = 1, \ldots, n - 1$. From this
configuration, the algorithm can move toward two equally likely configurations, one being
the mirror image reflection of the other about the image plane. In other words, this
configuration is located at a saddle point in the solution space for the problem (the solution
space is an $n$-dimensional space in which a value for $D_d(t, t')$ is assigned to every
possible combination of depth values $z_i(t')$), at which the gradient of $D_d$ is zero. To
make this gradient nonzero, the initial solution was perturbed slightly, thus causing the
algorithm to move in a particular direction.

The steepest descent minimization method from nonlinear programming (see, for example,
Luenberger, 1973) was used to compute the $z_i(t')$ for each new frame. The current
depth values $z_i(t)$ were used as the initial solution for $z_i(t')$ to begin the minimization
for the next frame. The objective function is given by $D_d(t, t')$, and its gradient by Eqs.
(A3). Golden section search was used to perform the one dimensional minimization
within the steepest descent method.
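The per-frame update described above can be sketched in a few lines. The code below is an illustrative reading of the procedure, not the authors' implementation: the function names, the step bracket `max_step`, the iteration count and the four-point test object are all assumptions. It evaluates the rigidity measure over unordered pairs of points, uses the analytic gradient with point 0 pinned at the origin, and picks the step length along the negative gradient by golden section search.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0  # golden ratio conjugate, about 0.618


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def lift(xy, z):
    """Combine image coordinates (x, y) with depths z into 3-D points."""
    return [(x, y, zi) for (x, y), zi in zip(xy, z)]


def rigidity(model, xy_new, z_new):
    """Sum over unordered pairs of (l_ij(t) - l_ij(t'))^2."""
    new = lift(xy_new, z_new)
    n = len(model)
    return sum((dist(model[i], model[j]) - dist(new[i], new[j])) ** 2
               for i in range(n) for j in range(i + 1, n))


def gradient(model, xy_new, z_new):
    """Partial derivatives of the rigidity measure with respect to each new depth."""
    new = lift(xy_new, z_new)
    n = len(model)
    g = [0.0] * n  # g[0] stays 0: point 0 is pinned at the origin
    for i in range(1, n):
        for j in range(n):
            if j != i:
                l_old, l_new = dist(model[i], model[j]), dist(new[i], new[j])
                g[i] -= 2.0 * (l_old - l_new) / l_new * (z_new[i] - z_new[j])
    return g


def golden_section(f, a, b, tol=1e-8):
    """One-dimensional minimization of f on [a, b] by golden section search."""
    c, d = b - GOLDEN * (b - a), a + GOLDEN * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - GOLDEN * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + GOLDEN * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def update_depths(model, xy_new, z_start, iterations=200, max_step=1.0):
    """Steepest descent on the depths, starting from the current depth values."""
    z = list(z_start)
    for _ in range(iterations):
        g = gradient(model, xy_new, z)
        if max(abs(v) for v in g) < 1e-12:
            break
        along = lambda s: rigidity(model, xy_new, [zi - s * gi for zi, gi in zip(z, g)])
        step = golden_section(along, 0.0, max_step)
        if along(step) >= along(0.0):  # accept only steps that lower the measure
            break
        z = [zi - step * gi for zi, gi in zip(z, g)]
    return z


# Hypothetical test object: four points rotated by 20 degrees about the vertical axis.
theta = math.radians(20.0)
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, -0.5), (1.0, 1.0, 1.0)]
xy_new = [(x * math.cos(theta) + z * math.sin(theta), y) for x, y, z in model]
z_start = [z for _, _, z in model]
z_new = update_depths(model, xy_new, z_start)
print(rigidity(model, xy_new, z_start), rigidity(model, xy_new, z_new))
```

The guard that rejects non-improving steps is an addition for safety in this sketch; golden section search assumes a unimodal function on the bracket, which the rigidity measure need not be.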


Orthographic  Projection:  The  Continuous  Formulation 


In section 2, we presented a continuous formulation, which requires the computation of
$z$ components of velocity, $\dot{z}_i(t)$, that minimize the measure of rigidity given by $D_c(t)$, as
expressed in Eq. (2.3). In the computer simulations, we used the particular measure of
overall deviation from rigidity given by:

$$D_c(t) = \sum_{i,j} \frac{\left[ (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right]^2}{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \tag{A4}$$

The $\dot{z}_i(t)$ that minimize $D_c(t)$ satisfy a system of linear equations given by:

$$\frac{\partial D_c}{\partial \dot{z}_i(t)} = 0, \qquad i = 1, \ldots, n - 1. \tag{A5}$$

These equations are of the form:

$$\frac{\partial D_c}{\partial \dot{z}_i} = 2 \sum_{j} \frac{(x_i - x_j)(\dot{x}_i - \dot{x}_j) + (y_i - y_j)(\dot{y}_i - \dot{y}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j)}{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \, (z_i - z_j) = 0. \tag{A6}$$


In the case of orthographic projection, only relative $z$ components of velocity can be
recovered. As in the discrete case, we again placed the initial point $(x_0(t), y_0(t), z_0(t))$ at the
origin of the coordinate system throughout the entire motion, so that $(x_0(t), y_0(t), z_0(t)) =
(\dot{x}_0(t), \dot{y}_0(t), \dot{z}_0(t)) = (0, 0, 0)$. At every instant the remaining $n - 1$ $z$ components of
velocity were then given relative to this initial point. Unless otherwise stated, we again
began with a flat initial configuration in which $z_i(0) = 0$, for $i = 1, \ldots, n - 1$, which was
perturbed slightly so that the gradient of $D_c(t)$ is initially nonzero.

At each moment, the $\dot{z}_i(t)$ were obtained by solving a system of linear equations, for
which we used the simple Gauss-Seidel relaxation method. The initial condition for the
relaxation was usually the set of velocities computed in the previous iteration, and $\dot{z}_i = 0$
for the first iteration. To integrate motion information over an extended time period, we
then made use of the following approximations using $\dot{x}_i(t)$, $\dot{y}_i(t)$, and $\dot{z}_i(t)$:

$$\begin{aligned} x_i(t) + \dot{x}_i(t)\,\delta t &\approx x_i(t + \delta t) \\ y_i(t) + \dot{y}_i(t)\,\delta t &\approx y_i(t + \delta t) \\ z_i(t) + \dot{z}_i(t)\,\delta t &\approx z_i(t + \delta t). \end{aligned} \tag{A6}$$

Once the $\dot{z}_i(t)$ were computed, by minimizing $D_c(t)$, the $z_i(t + \delta t)$ could be derived
from the current model $z_i(t)$. The new model was then taken as $M(t + \delta t) = (x_i(t + \delta t),
y_i(t + \delta t), z_i(t + \delta t))$, and the process was continued. The time interval, $\delta t$, typically
corresponded to an angular displacement between frames on the order of 0.1 degrees. We
also experimented with angular displacements up to two orders of magnitude smaller,
and found the qualitative behavior of the algorithm to be the same.
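One step of the continuous formulation can be sketched as follows. This is a hedged illustration rather than the original program: the function names, the sweep count and the test object are assumptions, and the model passed in is the true structure so that the expected answer is known. The linear system for the $z$ velocities is relaxed by Gauss-Seidel sweeps with point 0 pinned at the origin, and the model is then advanced by the first-order update.

```python
def solve_z_velocities(points, vel_xy, sweeps=500):
    """Gauss-Seidel relaxation for the z velocities that minimize D_c.

    points -- current 3-D model, list of (x, y, z); point 0 is pinned at the origin
    vel_xy -- known image velocities (xdot, ydot) of each point (orthographic case)
    """
    n = len(points)
    zdot = [0.0] * n  # start from zero velocities; zdot[0] is never updated
    for _ in range(sweeps):
        for i in range(1, n):
            diag, rhs = 0.0, 0.0
            for j in range(n):
                if j == i:
                    continue
                xij, yij, zij = (points[i][k] - points[j][k] for k in range(3))
                l2 = xij * xij + yij * yij + zij * zij
                weight = zij * zij / l2
                planar = (xij * (vel_xy[i][0] - vel_xy[j][0])
                          + yij * (vel_xy[i][1] - vel_xy[j][1]))
                diag += weight
                rhs += weight * zdot[j] - zij / l2 * planar
            if diag > 0.0:  # a flat model gives zero weights, hence the perturbation
                zdot[i] = rhs / diag
    return zdot


def euler_step(points, vel_xy, zdot, dt):
    """First-order update of the model over a time interval dt."""
    return [(x + vx * dt, y + vy * dt, z + vz * dt)
            for (x, y, z), (vx, vy), vz in zip(points, vel_xy, zdot)]


# Hypothetical rigid object rotating about the vertical (y) axis with angular velocity omega,
# so that xdot = omega * z and the true zdot = -omega * x.
omega = 0.1
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, -0.5), (1.0, 1.0, 1.0)]
vel_xy = [(omega * z, 0.0) for _, _, z in points]
zdot = solve_z_velocities(points, vel_xy)
next_model = euler_step(points, vel_xy, zdot, dt=0.01)
print(zdot)
```

When the model is exact, the recovered velocities match the rigid rotation; the instability discussed in the text arises when the model is imperfect and the errors are carried through many such small steps.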


Perspective  Projection:  The  Discrete  Formulation 

In the case of perspective projection, the $x$ and $y$ coordinates at time $t'$ can be expressed
in terms of the known image coordinates at time $t'$ and the depth values to be computed,
$z_i(t')$, as follows:

$$\begin{aligned} x_i(t') &= u_i(t')\, z_i(t') \\ y_i(t') &= v_i(t')\, z_i(t') \end{aligned} \tag{A7}$$

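The measure of rigidity to be minimized is given by substituting Eqs. (A7) into Eq.
(A1):

$$D_d(t, t') = \sum_{i,j} \left[ \left( (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 \right)^{1/2} - \left( (u'_i z'_i - u'_j z'_j)^2 + (v'_i z'_i - v'_j z'_j)^2 + (z'_i - z'_j)^2 \right)^{1/2} \right]^2 \tag{A8}$$

The $z_i(t')$ that minimize $D_d(t, t')$ satisfy a system of nonlinear equations given by:

$$\frac{\partial D_d}{\partial z_i(t')} = 0, \qquad i = 1, \ldots, n - 1. \tag{A9}$$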


In the computer simulations, these equations were solved implicitly through the use of a
steepest descent minimization algorithm for Eq. (A8), based on the gradient of $D_d(t, t')$.
The components of the gradient are given by the following:

$$\frac{\partial D_d}{\partial z_i(t')} = -2 \sum_{j} \frac{l_{ij} - l'_{ij}}{l'_{ij}} \left[ u'_i (u'_i z'_i - u'_j z'_j) + v'_i (v'_i z'_i - v'_j z'_j) + (z'_i - z'_j) \right]. \tag{A10}$$


The computed $z_i(t')$ are used to derive the new $x_i(t')$ and $y_i(t')$ using Eqs. (A7), which
then provide the new computed 3-D model. This new model also serves as an initial
solution to begin the minimization for the next frame.
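A quick way to check the perspective gradient is to compare it with central finite differences of the rigidity measure. The sketch below does this for a hypothetical four-point object; the function names, image coordinates and depth values are illustrative assumptions, the sum is taken over unordered pairs, and for simplicity all depths are treated as free variables.

```python
import math


def lift(uv, z):
    """Perspective back-projection: x = u * z, y = v * z."""
    return [(u * zi, v * zi, zi) for (u, v), zi in zip(uv, z)]


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rigidity(model, uv_new, z_new):
    """Sum over unordered pairs of (l_ij - l'_ij)^2 under perspective projection."""
    new = lift(uv_new, z_new)
    n = len(new)
    return sum((dist(model[i], model[j]) - dist(new[i], new[j])) ** 2
               for i in range(n) for j in range(i + 1, n))


def gradient(model, uv_new, z_new):
    """Analytic partial derivatives with respect to each new depth."""
    new = lift(uv_new, z_new)
    n = len(new)
    g = []
    for i in range(n):
        ui, vi = uv_new[i]
        total = 0.0
        for j in range(n):
            if j != i:
                l_old, l_new = dist(model[i], model[j]), dist(new[i], new[j])
                bracket = (ui * (new[i][0] - new[j][0])
                           + vi * (new[i][1] - new[j][1])
                           + (new[i][2] - new[j][2]))
                total += (l_old - l_new) / l_new * bracket
        g.append(-2.0 * total)
    return g


def numeric_gradient(model, uv_new, z_new, h=1e-6):
    """Central finite differences, used only to cross-check the analytic form."""
    g = []
    for i in range(len(z_new)):
        up, down = list(z_new), list(z_new)
        up[i] += h
        down[i] -= h
        g.append((rigidity(model, uv_new, up) - rigidity(model, uv_new, down)) / (2.0 * h))
    return g


# Hypothetical data: previous model, new image coordinates, and a depth guess.
model = lift([(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)], [5.0, 5.5, 4.5, 6.0])
uv_new = [(0.02, 0.0), (0.21, 0.01), (0.03, 0.2), (0.19, 0.22)]
z_guess = [5.0, 5.2, 4.8, 5.7]
analytic = gradient(model, uv_new, z_guess)
numeric = numeric_gradient(model, uv_new, z_guess)
print(analytic)
print(numeric)
```

Agreement between the two lists indicates that the bracketed term is the derivative of the new inter-point distance with respect to the depth, scaled by that distance.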


Perspective Projection: The Continuous Formulation

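In the continuous case, the $x$ and $y$ components of velocity of the points in space can be
expressed in terms of the known image coordinates and velocities as follows:

$$\begin{aligned} \dot{x}_i(t) &= \dot{u}_i(t)\, z_i(t) + u_i(t)\, \dot{z}_i(t) \\ \dot{y}_i(t) &= \dot{v}_i(t)\, z_i(t) + v_i(t)\, \dot{z}_i(t). \end{aligned} \tag{A11}$$

If we let $x_{ij} = x_i - x_j$, $y_{ij} = y_i - y_j$, $z_{ij} = z_i - z_j$, and $\dot{z}_{ij} = \dot{z}_i - \dot{z}_j$, then $D_c(t)$ is
expressed in terms of the coordinates of the points as follows:

$$D_c(t) = \sum_{i,j} \frac{\left[ x_{ij} (\dot{u}_i z_i + u_i \dot{z}_i - \dot{u}_j z_j - u_j \dot{z}_j) + y_{ij} (\dot{v}_i z_i + v_i \dot{z}_i - \dot{v}_j z_j - v_j \dot{z}_j) + z_{ij} \dot{z}_{ij} \right]^2}{x_{ij}^2 + y_{ij}^2 + z_{ij}^2} \tag{A12}$$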

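The $\dot{z}_i(t)$ that minimize $D_c(t)$ satisfy a system of linear equations given by:

$$\frac{\partial D_c}{\partial \dot{z}_i(t)} = 0, \qquad i = 1, \ldots, n - 1. \tag{A13}$$

These equations are of the form:

$$\frac{\partial D_c}{\partial \dot{z}_i} = 2 \sum_{j} b_{ij} \left[ u_i (x_i - x_j) + v_i (y_i - y_j) + (z_i - z_j) \right] = 0, \tag{A14}$$

where the $b_{ij}$ are given by:

$$b_{ij} = \frac{x_{ij} (\dot{u}_i z_i + u_i \dot{z}_i - \dot{u}_j z_j - u_j \dot{z}_j) + y_{ij} (\dot{v}_i z_i + v_i \dot{z}_i - \dot{v}_j z_j - v_j \dot{z}_j) + z_{ij} \dot{z}_{ij}}{x_{ij}^2 + y_{ij}^2 + z_{ij}^2}. \tag{A15}$$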


At each moment, the $\dot{z}_i(t)$ were obtained by solving the system of linear equations given
by Eqs. (A14), for which we used the simple Gauss-Seidel relaxation method. After the
$\dot{z}_i(t)$ were known, Eqs. (A11) were used to derive the $x$ and $y$ components of velocity
in space, $\dot{x}_i(t)$ and $\dot{y}_i(t)$. To integrate motion information over an extended time period,
we then made use of the approximations given in Eqs. (A6) to compute a new model
$(x_i(t + \delta t), y_i(t + \delta t), z_i(t + \delta t))$.

Appendix  B 


In  this  appendix,  we  explain  how  Eq.  (3.10)  was  derived. 

Eq. (3.8) depends on the image coordinates (data) and on the depth coordinates of the
points $z_i(t)$ and their velocities $\dot{z}_i(t)$ (computed values). These depth variables are in
general displaced from the true motion values. This departure can be incorporated in
Eq. (3.8) as follows:

$$\frac{\partial D}{\partial \dot{z}_i} \left( \hat{z}_1 + \epsilon_1, \ldots, \hat{z}_{n-1} + \epsilon_{n-1},\; \dot{\hat{z}}_1 + \dot{\epsilon}_1, \ldots, \dot{\hat{z}}_{n-1} + \dot{\epsilon}_{n-1} \right) = 0. \tag{B1}$$

If the $\epsilon_j$ and $\dot{\epsilon}_j$ are small enough, as is assumed in the stability analysis of section
3.2.2, then the first terms of the Taylor expansion of Eq. (B1) yield the following
approximation, with all derivatives evaluated at the true values $(\hat{z}_1, \ldots, \hat{z}_{n-1}, \dot{\hat{z}}_1, \ldots, \dot{\hat{z}}_{n-1})$:

$$0 = \frac{\partial D}{\partial \dot{z}_i} + \sum_{j=1}^{n-1} \frac{\partial^2 D}{\partial \dot{z}_i\, \partial z_j}\, \epsilon_j + \sum_{j=1}^{n-1} \frac{\partial^2 D}{\partial \dot{z}_i\, \partial \dot{z}_j}\, \dot{\epsilon}_j. \tag{B2}$$

But the true rigid solution certainly minimizes Eq. (3.7), as it sets it to 0, thus being a
solution of Eq. (3.8). It follows that the first term on the right-hand side of Eq. (B2) is
0, from which Eq. (3.10) is concluded.

Appendix  C 

In  this  appendix  we  indicate  the  method  by  which  equations  of  the  type  3.14  and  3.15 
were  derived. 

Equation 3.14 is derived by taking the necessary partial derivatives of $D$ (Eq. (3.7))
by $z_i$ and $\dot{z}_i$, and combining them as indicated in Eq. (3.11). These derivatives are
evaluated at the true solution itself, that is at $\hat{z}_i(t)$ and $\dot{\hat{z}}_i(t)$, and because we only
considered rigid motions in this paper, we could use the following relationship to simplify
the results:

$$(\dot{x}_i - \dot{x}_j)(x_i - x_j) + (\dot{z}_i - \dot{z}_j)(z_i - z_j) = 0. \tag{C1}$$

Eq. (C1) is derived by setting the distances between points $i$ and $j$ as constant and
taking the temporal derivative of this distance.
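The differentiation step behind the rigidity relationship can be written out explicitly. The lines below are a short worked derivation under the assumption, consistent with the rotations studied in this paper, that the axis of rotation is the vertical $y$ axis, so that $\dot{y}_i = 0$ for every point; the symbol $l_{ij}$ for the distance between points $i$ and $j$ follows Appendix A.

```latex
\begin{align*}
l_{ij}^2 &= (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 = \text{const}, \\
\frac{d}{dt}\, l_{ij}^2 &= 2 \left[ (x_i - x_j)(\dot{x}_i - \dot{x}_j)
   + (y_i - y_j)(\dot{y}_i - \dot{y}_j)
   + (z_i - z_j)(\dot{z}_i - \dot{z}_j) \right] = 0, \\
\dot{y}_i = \dot{y}_j = 0 \;&\Longrightarrow\;
   (x_i - x_j)(\dot{x}_i - \dot{x}_j) + (z_i - z_j)(\dot{z}_i - \dot{z}_j) = 0 .
\end{align*}
```

For a general rigid motion the $y$ term would have to be retained.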

The conclusion of Eq. (3.15) is derived by noting that, in a rigid rotation around
an axis perpendicular to the viewing axis, with constant angular velocity $\omega$, $\dot{z}_i$ can be
written as:

$$\dot{z}_i(t) = c_i \cos(\omega t + \phi_i). \tag{C2}$$

Then, by direct substitution of Eq. (C2) in Eq. (3.14) and its integration as indicated
in Eq. (3.15), we derive the stated result.

The calculations explained in this appendix, though straightforward, are cumbersome,
and were done by using Macsyma, a computer system for performing algebraic
manipulation.

Appendix  D 

In this appendix we show that if $C$ is an $n \times n$ matrix over an algebraically closed field
then:

$$\rho(C^j) = \rho^j(C), \tag{D1}$$

where $\rho$ is the spectral radius.
Let $P$ be such that:

$$P^{-1} C P = J, \tag{D2}$$

where $J$ is in Jordan canonical form. It follows that:

$$C^j = P J^j P^{-1}. \tag{D3}$$

Similar matrices have the same characteristic values and thus the same spectral radii, thus:

$$\rho(C) = \rho(J), \tag{D4}$$

$$\rho(C^j) = \rho(J^j). \tag{D5}$$

Direct inspection of the matrices $J^j$ shows that they have the same type of triangularization
as $J$ (that is, upper or lower triangularization), with the elements in the diagonal
being the $j$-th power of those of $J$. Thus it follows that:

$$\rho(J^j) = \rho^j(J), \tag{D6}$$

which together with Eqs. (D4) and (D5) implies the result stated in Eq. (D1).
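The identity can be checked numerically on a small example. The sketch below is only an illustration: the $2 \times 2$ matrix is an arbitrary choice with complex characteristic values, and the helper names are hypothetical. It computes the spectral radius from the closed-form roots of the characteristic polynomial and compares $\rho(C^j)$ with $\rho(C)^j$.

```python
import cmath


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matpow(m, j):
    """j-th power of a 2x2 matrix by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(j):
        result = matmul(result, m)
    return result


def spectral_radius(m):
    """Largest modulus among the roots of the characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)  # complex square root handles complex roots
    return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))


# Arbitrary example matrix (an assumption); its characteristic values are complex.
C = [[0.5, 1.0], [-0.3, 0.2]]
for j in (1, 2, 3, 5):
    print(j, spectral_radius(matpow(C, j)), spectral_radius(C) ** j)
```

The two printed columns should agree up to rounding for every power $j$.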


